DOCUMENT RESUME

ED 109 265                    95                    TM 004 864

AUTHOR          Nafziger, Dean H.; And Others
TITLE           Tests of Functional Adult Literacy: An Evaluation of
                Currently Available Instruments.
INSTITUTION     Northwest Regional Educational Lab., Portland, Oreg.
SPONS AGENCY    Office of Education (DHEW), Washington, D.C. Office
                of Planning, Budgeting, and Evaluation.
PUB DATE        Jun 75
CONTRACT        OEC-300-75-0098
NOTE            108p.

EDRS PRICE      MF-$0.76 HC-$5.70 PLUS POSTAGE
DESCRIPTORS     Adult Literacy; *Adults; Criterion Referenced Tests;
                Evaluation; Evaluation Criteria; *Functional
                Illiteracy; *Reading Tests; Standardized Tests;
                Testing; *Test Reviews; *Tests; Test Validity

ABSTRACT
Currently available measures of functional literacy
for adults are reviewed and evaluated. This report concentrates on
tests that are referenced to literacy skills important to an
adequately functioning adult, such as life skills, coping skills,
etc. Because functional literacy has frequently been defined in terms
of a grade level equivalent or some other norm, adult reading tests
referenced to a norm group are also included. A common set of 40
criteria categorized under four main headings are used: measurement
validity, examinee appropriateness, technical excellence, and
administrator usability. The report provides teachers and
administrators in Right-to-Read and other adult education programs a
reference for use in identifying and judging the value of tests
available for assessing adult functional literacy. To increase its
utility as a reference, summaries of a number of tests designed for
adults are included. The report consists of six major parts: (1)
Problems in Defining and Measuring Literacy; (2) Test Identification;
(3) Evaluative Criteria; (4) Test Reviews; (5) Test Evaluations; and
(6) Summary. Because many tests of functional literacy are newly
developed or still being developed, there may be tests which should
have been, but could not be, included in this report. No one set of
criteria is appropriate for judging all tests. Thus, these test
evaluations must be interpreted with respect to the intended use of
each test. (Author/RC)



Documents acquired by ERIC include many informal unpublished
materials not available from other sources. ERIC makes every effort
to obtain the best copy available. Nevertheless, items of marginal
reproducibility are often encountered, and this affects the quality
of the microfiche and hardcopy reproductions ERIC makes available
via the ERIC Document Reproduction Service (EDRS). EDRS is not
responsible for the quality of the original document. Reproductions
supplied by EDRS are the best that can be made from the original.



Tests of Functional Adult Literacy: 
An Evaluation of Currently Available Instruments 



Dean H. Nafziger
R. Brent Thompson
Michael Hiscox
Thomas R. Owen



Assessment Projects
Northwest Regional Educational Laboratory
710 S. W. Second Avenue
Portland, Oregon 97204

June 1975 



Prepared for the 
U. S. Office of Education
Office of Planning, Budget and Evaluation 
Contract No. 300-75-0098 



The work presented or reported herein was performed pursuant 
to a Contract from the U. S. Office of Education, Department 
of Health, Education, and Welfare. However, the opinions 
expressed herein do not necessarily reflect the position or 
policy of the U. S. Office of Education, and no official 
endorsement by the U. S. Office of Education should be inferred.






ACKNOWLEDGMENTS 



Special thanks are due many people who assisted in the project and
in the preparation of this report. The authors are grateful to Robert C. Hall
of the U. S. Office of Education for his guidance and assistance during all
phases of the evaluation. We wish to thank W. Aubrey Gardner and James
Sanders for their assistance in planning and conceptualizing the study, and
for their guidance and consultation throughout the project. Also, we
appreciate the work of Diane S. Thompson and Ann Helmick in preparing parts
of the report; and of Robert J. Silverman and William J. Wright, who provided
many valuable suggestions on an earlier draft of the manuscript.

The contributions of Vicki Spandel in editing the report and of Julie
Stange, Peggy Hootstein, and Trena Jabbour in their careful preparation of
the manuscript are especially appreciated.

Finally, we express our gratitude to many professionals in Right-to-Read
and adult education programs at the federal, state, and local levels who
answered our written and oral requests for information. Naturally, the authors 
accept all responsibility for the contents of the report; and no responsibility 
for any shortcomings inherent in this report should be assigned to those who 
gave generously of their time in its preparation. 



TABLE OF CONTENTS

ACKNOWLEDGMENTS

TESTS OF FUNCTIONAL ADULT LITERACY:
AN EVALUATION OF AVAILABLE INSTRUMENTS
     Organization of the Report

PROBLEMS IN DEFINING AND MEASURING FUNCTIONAL LITERACY
     Estimates of the Extent of Illiteracy
     Literacy Definitions
     Functional Literacy
     Choosing Tasks to be Measured
     Summary

TEST IDENTIFICATION

EVALUATIVE CRITERIA
     Measurement Validity
     Examinee Appropriateness
     Technical Excellence
     Administrative Usability

TEST REVIEWS

CRITERION-REFERENCED FUNCTIONAL LITERACY TESTS
     Adult Performance Level Functional Literacy Test (APL)
     Basic Reading Skills Mastery Test
     Reading/Everyday Activities in Life (R/EAL)
     Wisconsin Test of Adult Basic Education (WITABE)

STANDARDIZED TESTS
     Adult Basic Learning Examination (ABLE), Level I
     Basic Occupational Literacy Test (BOLT), Fundamental Level
     General Educational Performance Index (GEPI)
     SRA Reading Index
     Tests of Adult Basic Education (TABE), Level E

INFORMAL TESTS
     Adult Basic Reading Inventory
     Cyzyk Pre-Reading Inventory
     Harris Graded Word List and the Informal Textbook Test
     Idaho State Penitentiary Informal Reading Inventory
     An Informal Reading Inventory for Use by Teachers of
          Adult Basic Education
     Individual Reading Placement Inventory
     Initial Testing Locator Tests
     Reading Evaluation - Adult Diagnosis (READ)

TEST EVALUATIONS

SUMMARY

FOOTNOTES

BIBLIOGRAPHY



TESTS OF FUNCTIONAL ADULT LITERACY:
AN EVALUATION OF CURRENTLY AVAILABLE INSTRUMENTS 



Adult illiteracy was recently designated a major target area of the
Right-to-Read program in the United States. The extent of the program's
commitment to reducing adult illiteracy is reflected in a national goal of the
Right-to-Read program: to eliminate functional illiteracy by 1980 among 90% of
the population over 16 years of age.¹ In particular, Right-to-Read seeks to
teach necessary reading skills to adults who have not been successful 
participants in society. Increasing emphasis on functional literacy has led 
to a proliferation of reading programs designed to teach reading tasks 
important to social survival. The desire to determine the efficacy of these 
programs has led, in turn, to a need for instruments that measure functional 
literacy. 

The purpose of this report is to review and evaluate currently available
measures of functional literacy. The report concentrates on tests that are
referenced to literacy skills important to an adequately functioning adult.
These skills have been referred to as life skills, survival skills, coping
skills, and so on.² Because functional literacy has frequently been defined
in terms of a grade level equivalent or some other norm, adult reading tests
referenced to a norm group are also included. A common set of criteria,
which address characteristics important for any test, was used to evaluate
all tests included in this report.

The report summarizes the current availability of tests of adult
functional literacy. It is also intended to provide administrators and teachers
in Right-to-Read and other adult education programs a reference for use in
identifying and judging the value of tests available for assessing adult
functional literacy. To increase its utility as a reference, the report
includes summaries of a number of tests designed for adults.

It is also important to note what this report does not attempt to
provide. First, the contractual mandate of this study was to review and
evaluate only those tests developed strictly for adults. Therefore, this
report does not provide a comprehensive listing of all tests used in measuring
adult reading ability, since many such tests were developed for children;
excellent resources that list these tests are already available.³ Second,
this report does not identify and evaluate tests which are inextricably bound
to specific instructional materials, curricula, or programs; only tests
appropriate for general use are listed.

In addition, the report has certain limitations. Because many tests
of functional literacy are newly developed or still being developed, their
existence is not widely known. Despite the national mail survey that preceded
this report, some such tests may not have been identified and included.
Also, some authors requested that their tests be excluded from consideration
until further work on them was completed. As a result, there may be tests
which should have been, but could not be, included in this report.

Another limitation concerns the fact that no one set of criteria is
appropriate for judging all tests. Most tests have some unique characteristics
that bring into question the applicability of some criteria. This problem is
intensified when different kinds of tests (e.g., norm-referenced and
criterion-referenced) are being judged by the same criteria. Moreover, standard
criteria may not reflect the interests or priorities of a particular audience
for evaluation results. Thus, the reader must interpret the test evaluations
in this report with respect to the intended use of each test.






A further limitation lies in the fact that the tests were evaluated
according to existing public data provided by the test authors or publishers.
On criteria for which no data were available, tests received an unfavorable
evaluation. While this approach was judged by the report authors to be
the most equitable known, it has the disadvantage of appearing overly stringent
in relation to tests still in the early stages of development.

Organization of the Report 

Following this introductory section, this report consists of six major
parts:

1. The Problems in Defining and Measuring Literacy. This section
   includes estimates of the extent of illiteracy, definitions
   of literacy, notions of functional literacy, and problems in
   choosing tasks to measure literacy.

2. Test Identification. This section includes a discussion of three
   major activities undertaken to identify tests: a literature
   search, requests to publishers and professionals involved in
   adult education, and telephone and personal interviews with
   persons active in teaching and measuring adult reading.

3. Evaluative Criteria. This section includes a discussion of
   the 41 criteria used in evaluating the tests, and an explanation
   of how these criteria are categorized under four main headings:
   measurement validity, examinee appropriateness, technical
   excellence, and administrative usability.

4. Test Reviews. This section presents descriptive reviews of the
   tests, which are grouped into three subsections: criterion-referenced
   functional literacy tests, standardized tests, and informal tests.

5. Test Evaluations. This section presents evaluations of the
   tests, which are grouped according to the same overall organization
   as the test reviews.

6. Summary. This section notes some general strengths and weaknesses of
   different types of tests. It also describes continuing work related
   to the measurement of functional literacy being conducted by three
   groups.






PROBLEMS IN DEFINING AND MEASURING FUNCTIONAL LITERACY 



Conducting adult literacy assessment necessarily requires an
understanding of what literacy is. Achieving that understanding is difficult
because literacy is not a solitary trait; it comprises many sub-skills.
Also, one might be considered literate (able to comprehend) in some content
areas, though not in others.

The multifaceted nature of literacy has often been glossed over through
the use of such composite scores as standard scores and grade level
equivalents. For example, one might say, "He is reading at grade level 7.2";
in a very general way, this kind of normative statement relates a particular
person's performance on some unknown reading task to the performance of
others at a particular (in this case educational) level. It is not usually
clear how this level of performance would relate to any other possible
literacy tasks. One could argue that, with young children, general reading
ability that can be applied to a broad range of tasks is most important to
consider; but with adults, especially those who are only marginally literate,
one is more concerned with whether they can perform particular sets of life-
or work-related literacy tasks.

Estimates of the Extent of Illiteracy 

Estimates of the extent of illiteracy in the United States vary considerably,
depending somewhat on the method of assessment used. The Census Bureau
considers literate anyone 14 years of age or older who has completed sixth
grade.⁴ Those who have not completed the sixth grade are asked whether they
can read and write a simple message in any language; if they say, "Yes," they
are considered literate. Based on this method, it is estimated that
approximately one percent of those aged 14 years and older are illiterate.




However, the self-report feature of the inquiry, the concern that the
description "simple messages" may not be adequate, and the uncertainty
about the relationship of literacy to completing the sixth grade
together cast considerable doubt on the Census Bureau's method of estimating
literacy, except perhaps as a way of deriving a lower bound estimate.
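The Census Bureau rule described above can be written out as a short decision procedure. The following Python sketch is only an illustration of the rule as this report summarizes it; the function name and argument names are hypothetical and are not taken from any Census Bureau publication.

```python
def census_literate(age, completed_sixth_grade, says_can_read_write=None):
    """Classify one person under the rule as summarized in this report.

    Returns True (literate), False (illiterate), or None when the person
    is under 14 and therefore outside the scope of the rule.
    All names here are illustrative assumptions.
    """
    if age < 14:
        return None  # rule applies only to persons 14 years of age or older
    if completed_sixth_grade:
        return True  # sixth-grade completion is taken as literacy
    # Otherwise the classification rests entirely on the person's own answer
    # to "can you read and write a simple message in any language?"
    return bool(says_can_read_write)
```

The last branch makes the self-report weakness visible: for anyone who has not completed the sixth grade, the result depends on nothing but the respondent's own answer.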

Assessing levels of literacy using grade level equivalent scores on
some type of reading test is a common practice. The National Center for
Health Statistics has conducted a survey using their Brief Test of Literacy,
which shows that 4.6 percent of individuals 12-17 years old score below the
average 4th grader on the instrument and can therefore be regarded as
illiterate. This method of assessment is not useful because it is uncertain
what specific performances are implied by success on the test.

Indices of literacy such as those discussed here may be useful at the 
"first guess" level. They are inadequate beyond that point, however, either 
because they do not relate directly to literacy, or because they do not 
permit inferences about what sorts of functional competencies given levels 
of literacy imply.

To obtain a more useful estimate of the extent of illiteracy, some recent
work has been done to define what literacy-related tasks adult members of
this society must perform, and to build assessment instruments that measure
performance on those tasks. Certainly the best publicized of these attempts
was made by the Harris survey team, who were commissioned by the National
Reading Center to conduct a study of adult functional illiteracy. They
asked respondents to read and fill in the appropriate information on five
forms: an application for public assistance, an application for Medicaid, an
application for a driver's license, a personal identification form, and a
personal loan application. Using the criterion of 90 percent correct responses
on these forms, Harris reports that 13 percent of their sample, or an
estimated 18.5 million Americans, fell below that level; that is, they were
marginally literate to functionally illiterate in terms of ability to perform
these tasks. (It has been asserted by some that these data are statistically
incorrect, and that the correct estimate, based on the 1970 Harris survey,
should have been 6.5 percent below the literacy level.) While the range
of literacy tasks employed in this survey is limited, the tasks do represent
some of the common literacy tasks which adults are required to perform.

A second survey, conducted by Harris in 1971, explored respondents'
ability to successfully answer straightforward questions about newspaper
employment advertisements. Ninety-two percent of the total sample got all
nine of the questions correct, although only 70 percent of all Blacks
tested got nine correct. Survey personnel obtained similar results using
classified housing advertisements; 88 percent of those surveyed got all
items correct.⁸ Blacks averaged 67 percent correct.

Thus, it appears that, using several literacy tasks chosen simply as
examples, the national level of marginal to complete illiteracy might encompass
around ten percent of the population, and might be much higher among some
minorities. These data also show higher illiteracy rates for low income and
low education groups. As instruments for assessing literacy, however, neither
the representativeness of the tasks nor the performance levels used has
any empirical support.

In order to produce a valid set of tasks for assessing adult competencies,
Norvell Northcutt of the Adult Performance Level (APL) Project conducted an
extensive literature search, surveyed governmental agencies and foundations
to determine the characteristics of successful and unsuccessful adults, and
interviewed adults who were under-educated and underemployed, employers, and
personnel specialists. The necessary skills identified during this 1975
search can be grouped into the following four areas: (a) communication
skills, (b) computational skills, (c) problem solving skills, and
(d) interpersonal skills. Northcutt also identifies five general knowledge
areas: (a) occupational knowledge, (b) consumer economics, (c) community
resources, (d) government and law, and (e) health.⁹

Because these skills demand much more than the ability to use or
comprehend written material, they do not fit comfortably within the concept
of literacy. Therefore, the APL staff substituted the term "functional
competency" for "functional literacy."

Using national samples, the Adult Performance Level Project has determined 
that as many as 20 percent of the adult population are functionally incompetent. 
Indeed, in one of the skill areas, computation, it appears that one-third of 
U.S. adults may be functionally incompetent. Only 70 percent of those surveyed 
could indicate the proper number of exemptions on a W-4 form when given the 
number of dependents. On a task requiring the respondents to match personal 
characteristics with job requirements in an employment advertisement, only 
62 percent succeeded. More than 20 percent of those surveyed could not draw 
the proper conclusions from a notice of a store's check cashing privileges.
Overall, the APL project staff estimate that more than 20 percent of U.S.
citizens are functionally incompetent at reading, a figure which contrasts
sharply with the results of earlier surveys.

It would appear that as the tasks used in literacy assessment instruments
become more like "real world" tasks in the sense of requiring composite
skills, estimates of the extent of illiteracy increase proportionately. One
might expect this. It simply indicates that the more marginal a person's
skills, the more likely he is to fail at tasks for which the requisite
skills are interdependent.

Bormuth has stated that it is important to carefully derive both the
literacy behaviors and the acceptable levels for success. The Northcutt
study appears to have surpassed previous studies on the first item, but is 
still arbitrary in assigning criterion levels of success. 

Bormuth's work includes an example of a different task which has
been used to assess the extent of functional literacy in a particular population.
In 1969, he prepared cloze tests on several newspaper passages and tested a
sample of high school seniors.* He set a level of 35 percent correct as a
criterion for adequate performance on the test. The 35 percent criterion is
based on a conclusion Bormuth drew from earlier research: that people with
cloze scores of 35 percent or less were able to extract very little meaning
from the passage. Only 65 percent of the sample correctly answered 35 percent
of the cloze items.

Literacy Definitions

The preceding discussion offers a general perspective of literacy based
on the efforts of those who sought to assess levels of literacy. The estimates
of illiteracy given in that section vary because there is little consensus
about what constitutes literacy. The purpose of this section is to further
examine the differences among conceptions of reading and literacy by presenting
some common definitions. Consider the following definitions of the reading
process. Bower commented that

     Reading is a sequential process in which ongoing processing
     is affected by prior processing and will determine future
     processing.



* To prepare a cloze passage one deletes every nth word, and it is the task
of the reader to fill in the missing words.
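The cloze procedure described in this note, together with the 35 percent criterion discussed earlier, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Bormuth's actual procedure: the function names are hypothetical, words are split on whitespace (so punctuation stays attached to a word), and a response counts as correct only if it matches the deleted word exactly, ignoring case.

```python
def make_cloze(text, n=5, blank="_____"):
    """Delete every nth word; return the blanked passage and the deleted words."""
    passage, answers = [], []
    for position, word in enumerate(text.split(), start=1):
        if position % n == 0:
            answers.append(word)   # remember the deleted word
            passage.append(blank)  # and leave a blank in its place
        else:
            passage.append(word)
    return " ".join(passage), answers


def cloze_percent_correct(responses, answers):
    """Percent of blanks filled with the exact deleted word (assumed scoring rule)."""
    if not answers:
        raise ValueError("passage has no deleted words")
    correct = sum(
        r.strip().lower() == a.strip().lower()
        for r, a in zip(responses, answers)
    )
    return 100.0 * correct / len(answers)


def meets_criterion(percent_correct, criterion=35.0):
    """Adequate performance: at least the criterion percent correct."""
    return percent_correct >= criterion
```

For instance, `make_cloze("the quick brown fox jumps over the lazy dog today", n=5)` blanks the fifth and tenth words, and a reader who restores one of the two scores 50 percent, which meets the 35 percent criterion.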






In a similar statement, Goodman said that the reader

     ...concentrates his total prior experience and learning on
     the task, drawing on his experiences and the concepts he has
     attained as well as the language competence he has achieved.

Both emphasize the role of prior knowledge in facilitating the reading process,
and couch their definitions in descriptions of what an individual does.
Gibson offers a similar description of reading:

     There are several ways to characterize the behavior we call
     reading. It is receiving communication; it is making discriminative
     responses to graphic symbols; it is decoding graphic symbols
     to speech; and it is getting meaning from the printed page.

These definitions of reading refer primarily to information processing
mechanisms that the reader must or may employ, and say little about the nature
of reading itself.

Literacy, in contrast to reading, implies both basic reading skills and
socially appropriate reading behavior, and any definition of literacy must
incorporate both. Bormuth offers the following comprehensive definition:

     In the broadest sense of the word, literacy is the ability to
     exhibit all of the behaviors a person needs in order to respond
     appropriately to all possible reading tasks.

Of course, no one is literate to this extent. If literacy is to be a realistic
goal of an educational program, it must be defined as some subset of the total
set of reading tasks and the behaviors required to accomplish those tasks.
Bormuth suggests that this subset be selected on the basis of economic, social,
cultural, and political benefits to the individual or his society; that is,
for pragmatic reasons.

In recent assessments, tasks assessing literacy have been chosen more for
their social utility than for their relationship to presumed underlying
dimensions of reading. This is consistent with the theory that literacy
involves more than reading skills alone.






Functional Literacy 

The term "functional literacy" connotes reading for a purpose, a purpose
in some way related to social utility. William S. Gray defines functional
literacy as "the ability to engage effectively in all those reading activities
normally expected of a literate adult in his community." This definition,
while circular, does emphasize the fact that certain tasks are required of
adults by members of their community. The U. S. Office of Education has
defined a literate person as

     ...one who has acquired the essential knowledge and skills
     in reading, writing, and computation required for effective
     functioning in society, and whose attainment in such
     skills makes it possible for him to develop new aptitudes
     and to participate actively in the life of his times.

U.S.O.E. has operationalized this definition by suggesting that adults
be able to perform the following tasks:

o Read and understand all sections of a newspaper, with
  particular emphasis on the classified and advertisement
  sections

o Read the driver's license test in any state

o Read and understand voter registration instructions

o Read and comprehend the key features of popular business
  contracts such as those issued by used car dealers,
  furniture stores, clothing shops, and auto repair dealers

o Read labels on such household items as groceries,
  medicines, recipes, machine instructions, etc.

o Read the materials necessary to perform jobs classified
  as entry level

o Read personal letters, bills

o Read and follow public instructions such as road and
  building signs

o Read and use the telephone directory

o Read and complete job application forms

o Read and comprehend business letters from debtors and
  creditors




Sticht defines functional literacy as "a possession of those literacy
skills needed to successfully perform some reading task imposed by an external
agent between a reader and a goal the reader wishes to obtain." He points
out that this excludes such reading activities as reading for pleasure. Also,
he differentiates between reading to learn a job and reading to do a job. As
a rule, the former requires a higher level of literacy than the latter.

From these definitions and operationalizations of the concept of
functional literacy, one can infer that some of the major assessment problems
relate to creating instruments which reflect special concerns and help establish
the importance of certain tasks.

Choosing Tasks to be Measured

One difficulty in choosing tasks to assess functional literacy lies in
accurately identifying the skills involved. Carver has argued that some of
the higher order comprehension items in reading inventories may relate more to
thinking than to reading itself. Furthermore, successfully completing some
comprehension items might also relate to one's general knowledge of the subject 
matter. 

Carver also suggests that if it is actually the ability to reason that is 
being assessed, the evaluative judgment one makes about a reading program may 
be distinctly unfair. The same argument may be advanced regarding external 
knowledge or experience and their relationship to reading. One may choose to 
broaden a reading program's educational goals, basing them on performance tasks 
used in functional literacy assessments. Bormuth warns, however, that
such an approach may commit a program to a much more difficult undertaking
than anyone realizes.

Though traditional norm-referenced reading tests (particularly the
comprehension sections) may be measuring intelligence rather than reading
skills, that problem cannot be categorically solved simply by shifting to
criterion-referenced tests. The tasks themselves determine what is being
measured regardless of whether norms are constructed. MacGinitie argues that:

     Giving a score that refers to some criterion rather
     than to a norm group does not absolve the test maker
     from showing that separate component scores index
     meaningful skill levels or separately measurable
     skills.²³

Unless one is very careful, one may be actually assessing language skills,
intelligence, or general knowledge, even when using a criterion-referenced
instrument.

Summary 

Because no standard definition of literacy exists, estimates of the 
extent of illiteracy in the United States vary widely. Recent use of the 
term literacy connotes the ability to perform functional reading tasks — 
i.e., tasks which are important for successful participation in society. 
Because these definitions concern the attainment of a set of minimal skills, 
they imply the need for criterion-referenced tests that will measure the 
attainment of such skills. One consideration in constructing criterion- 
referenced tests of functional literacy is selecting tasks that are important 
for adequately functioning adults. A primary purpose of this report is to
examine the extent to which measures of adult functional literacy meet this 
and other considerations. 






TEST IDENTIFICATION 

A major part of preparing this report was identifying tests to be
included. Tests and background information were gathered in three ways.
First, a literature search was conducted to identify tests, test reference
books, and articles on current test development efforts. Many tests and some
major test references, such as those noted earlier, were identified during
this activity.

Second, requests for information were mailed to publishers of adult 
literacy materials and to professionals active in adult education. All known 
publishers of tests of adult education materials were contacted. They were 
asked if all tests they had available for measuring functional adult literacy 
could be purchased. Requests for tests and information from professionals 
were sent to state Right-to-Read Directors, State Directors of Adult Basic
Education (ABE), USOE Staff Development Directors and Program Officers for ABE, 
and directors of programs for adult educators in colleges and universities. 

One hundred twenty-eight (60 percent) of the 212 professionals contacted
returned questionnaires. In addition, several individuals made copies of
the questionnaire so that other members of their staff could respond as well.
Forty-four (56 percent) of the 79 publishers contacted responded to the 
solicitation letters. Follow-up letters were sent as a part of this 
solicitation effort. 
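The response rates quoted above follow directly from the counts given. As a quick check, the arithmetic can be reproduced with a small Python helper; the function name is hypothetical and rounding to the nearest whole percent is an assumption about how the report's figures were derived.

```python
def response_rate(returned, contacted):
    """Percent of those contacted who responded, rounded to a whole percent."""
    if contacted <= 0:
        raise ValueError("contacted must be positive")
    return round(100 * returned / contacted)


# Counts taken from the text above.
professionals = response_rate(128, 212)  # 60.4 percent before rounding
publishers = response_rate(44, 79)       # 55.7 percent before rounding
```

Both results agree with the percentages reported in the text.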

Third, telephone and personal interviews were conducted with individuals
active in teaching and measuring adult reading. Those interviewed included
developers of measurement instruments, coordinators of adult education programs,
teachers of adults, and specialists in reading measurement. These interviews
were conducted for varying reasons: to help identify tests, to gain more
information about tests already identified, and to obtain information about
criteria to be used in evaluating the tests.



As a result of these activities, approximately 350 tests used in
measuring adult reading ability were identified. Most were designed for
elementary and secondary school students; fewer than 30 of the tests collected
had been designed specifically for use with adults.

By contractual mandate the project focus was on tests developed for
adults. Therefore, many commonly used tests were excluded because they
were designed for children rather than adults. The Gray Oral Reading Test
and the Stanford Diagnostic Reading Test are examples of widely used tests
which were excluded from this report because they were originally designed
for children.







EVALUATIVE CRITERIA 



Numerous sources were consulted to identify or develop criteria for 
test evaluation. The criteria adopted for this report relied heavily on the 
criteria used by the Center for the Study of Evaluation (CSE) at UCLA in 
their comprehensive test evaluations. The CSE criteria offered two major
advantages. First, they represented a complete compilation of generally
accepted test standards. Second, they had been extensively used by CSE in
evaluating tests; weaknesses and ambiguities had, therefore, been largely
eliminated.

Even so, the CSE criteria presented some problems with respect to
measuring adult literacy. For example, the CSE criteria included one judgment
that favors tests which are group administered. But for test-anxious adults,
a group administered test may not always be the better choice. Therefore, 
the criterion awarding a point for group administration of tests was dropped. 

Furthermore, since the CSE criteria were designed for application to 
a wide range of tests, certain specific concerns in measuring functional 
literacy could not be addressed. Thus, it was necessary to add questions such
as: "Are there scales of performance on real-life skills (e.g., map reading,
understanding want ads, etc.)?"

Like CSE's criteria, our criteria focused on four major areas: measurement 
validity, examinee appropriateness, technical excellence, and administrative 
usability. Each of these areas consisted of several individual criteria. 
Tests were assigned points indicating the extent to which they met each 
criterion; then the points were totaled for each of the four areas. Finally, 
an area grade of good, fair, or poor was assigned, based upon the total points
obtained for the criteria within the area. Within each area the numbers of
points designating the total grade (i.e., good, fair, poor) were chosen in
such a way that most of the criteria would have to be attained at the maximum
level in order for the test to obtain a high grade for the area.

The criteria were applied to each test as a whole, or subtest by subtest.
Each test was independently evaluated by at least two people; differences in
ratings were adjudicated by a third person. The evaluators all had previous
experience or training in educational measurement. They were trained in the
use of the criteria, and their judgments were checked for consistency and
accuracy during the training. In addition to the evaluations, a descriptive
review was prepared for each test. These reviews describe the tests and
summarize the administration, scoring, and interpretation procedures and the
available technical data.

On the following pages, the criteria used to evaluate the tests are
described. Evaluative decisions were based on information presented in the
manuals and supplements accompanying published tests, or on information
concerning unpublished tests supplied by test authors at our request. No
attempt was made to verify available information. When needed information
was not available and was not readily inferable, a test was credited with 0
points (the lowest rating) on the relevant criterion.

Credit and appreciation are due to Ralph Hoepfner and others at the
Center for the Study of Evaluation, whose pioneering work we have freely
borrowed and adapted in arriving at the criteria which follow. Of course,
we accept sole responsibility for the final set of criteria used in this project
and for their application.

Measurement Validity 

a. Is information provided to indicate a rigorous selection of items and
careful sampling of the behavior domain?

Such information was considered adequate, provided references on
the construction of the test were included. If the procedures used in
developing test specifications and items were described in some detail,
the test was credited with 2 points; if reference was made indicating
the use of a specific, rigorous item selection procedure, the test was
credited with 1 point; if no information was provided on item selection,
the test was credited with 0 points.

b. Were any empirical procedures used for screening or selecting the items?

Empirical procedures include item analyses, juries of experts, item
difficulties, criterion-group analyses, or factor analyses. If more than
one method was reported in some detail, the test was credited with 3
points; if it was stated that more than one method had been used, or if
one method was reported in some detail, the test was credited with 2 points;
if it was stated that one method had been used, the test was credited
with 1 point; and if no information was given, the test was credited with
0 points.
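The point rule for the empirical-procedures criterion can be sketched in Python. This is only an illustration of the rule as worded above, not part of the original procedure; the function name and its two counts (methods mentioned at all, methods reported in some detail) are hypothetical.

```python
def empirical_procedure_points(methods_mentioned: int, methods_detailed: int) -> int:
    """Points (0-3) for the empirical-procedures criterion.

    methods_mentioned: number of empirical methods the manual says were used.
    methods_detailed:  how many of those are reported in some detail.
    Both argument names are illustrative assumptions.
    """
    if methods_mentioned < 0 or methods_detailed < 0:
        raise ValueError("counts must be non-negative")
    if methods_detailed > methods_mentioned:
        raise ValueError("detailed methods cannot exceed mentioned methods")
    if methods_detailed >= 2:
        return 3  # more than one method reported in some detail
    if methods_mentioned >= 2 or methods_detailed == 1:
        return 2  # several methods stated, or one method detailed
    if methods_mentioned == 1:
        return 1  # one method stated, no detail
    return 0      # no information given
```

For example, a manual that names item analysis and a jury of experts but describes neither would receive 2 points under this reading.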

c. Are the items tied into specified objectives or criteria?

If the test items were generally related to specified objectives
or criteria (such as tasks from a task analysis), the test was credited
with 1 point. If items were not generally so related, or if objectives
or criteria were lacking, the test was credited with 0 points.

d. Does the construct or type of behavior that the test purports to measure
have a supportive base in linguistic, educational, psychological, or
learning theory?

This criterion was applied to statements describing the theoretical
basis of the test or to statements justifying the existence of the test
(e.g., "oral reading scores correlate only slightly with silent reading
test results; therefore, we felt the need for a separate oral reading
test, which could possibly be used as part of a more comprehensive
testing effort"). If the test included such a statement, it was
credited with 1 point; if not, it was credited with 0 points.

e. Has the test been employed in experiments or evaluations?

If the test scores in such experiments appeared to have yielded
meaningful results, the test was credited with 1 point; if not, the test
was credited with 0 points.

f. Are any concurrent validity studies (demonstrating correlation with some
criterion measures obtained at the same time as the test) reported or
specifically referred to in which the criteria (not other scores of the
same test) are related in a meaningful way to the goal behavior to which
the test was assigned?

If the criterion behavior was relevant and the coefficient was
.70 or more, the test was credited with 2 points; if the coefficient was
between .30 and .70, or the criterion behavior was not convincingly
relevant, the test was credited with 1 point; if no study was reported,
coefficients were low, or the criterion was clearly irrelevant to the
nature of the test, 0 points were credited.
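One way to read the concurrent validity rule is sketched below in Python. The function name and the three relevance labels are hypothetical, and treating .30 as the inclusive lower bound of the middle band is an assumption, since the text does not say how boundary values were handled.

```python
from typing import Optional


def concurrent_validity_points(coefficient: Optional[float], relevance: str) -> int:
    """Points (0-2) for the concurrent validity criterion.

    coefficient: reported validity coefficient, or None if no study was reported.
    relevance:   'relevant', 'questionable', or 'irrelevant' (illustrative labels).
    """
    if relevance not in ("relevant", "questionable", "irrelevant"):
        raise ValueError("unknown relevance label: " + relevance)
    # No study, a clearly irrelevant criterion, or a low coefficient earns nothing.
    if coefficient is None or relevance == "irrelevant" or coefficient < 0.30:
        return 0
    # Full credit needs both a relevant criterion and a coefficient of .70 or more.
    if coefficient >= 0.70 and relevance == "relevant":
        return 2
    # Everything else: moderate coefficient, or relevance not convincing.
    return 1
```

Under this sketch a coefficient of .75 against a criterion of doubtful relevance earns 1 point rather than 2.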

g. Are any predictive validity studies (the criterion behavior, usually success
in some area, is obtained after a stated time interval) reported or
specifically referred to in which the criteria were related in a meaningful
way to the goal behavior to which the test was assigned?

If coefficients at or above .70 were reported with relevant criteria
and a time interval of one month or more, the test was credited with 2
points. If only moderate coefficients (.30 to .70) were reported,
or the criteria were of questionable meaningfulness, the test was
credited with 1 point. If no study was reported or referenced, or the
study was patently irrelevant, the test was credited with 0 points.

The Measurement Validity ratings were summed for a total rating, varying 
from 0 to 12 points. These ratings were translated into letter grades of
G (good, 10 to 12 points), F (fair, 6 to 9 points), and P (poor, 0 to 5 points).

Examinee Appropriateness

a. Does the test justify itself by explaining to the examinee in an honest
manner its purpose, intent, or recommended use?

Misuse of test scores was not considered here, since such misuse is
impossible to control. If the test (usually the test instructions)
specifically stated the real or suggested purpose, intent, or use of the
test, or if the manual suggested that such a justification be given in each
situation, the test was credited with 1 point. If no purpose, intent, or
use was specified, if the purpose or intent was disguised or concealed,
or if examinees were led to adopt ineffective test-taking strategies, the
test was credited with 0 points. This criterion was evaluated rather
liberally in most cases, so that a test whose instructions began, "This is
a test of your ability to spell..." was given credit for justification.

b. Are the test items personally inoffensive and appropriate in terms of
difficulty for adults in basic education or similar settings?

If all the items appeared inoffensive and reasonably appropriate in
difficulty level, the test was credited with 2 points. If most items
appeared appropriate or there were few serious typographical errors, the
test was credited with 1 point. If many of the items were judged
inappropriate because (1) they were ambiguous or misleading, they lacked
demonstrably correct or incorrect alternatives, or they were stated in
unnecessarily complex language, or (2) they were personally offensive,
inappropriate or offensive to special groups, too simple, or intellectually
insulting in simplicity, the test was credited with 0 points.

c. Are the items relevant and interesting for adult examinees?

This rating was made somewhat independently of test content so that
inherently interesting subject matter did not necessarily profit from
this rating. One way to rephrase this rating would be to ask: Given the
nature of the subject matter, have the items been made as relevant and
interesting as they could be? If the items were judged relevant and
interesting, the test was credited with 1 point. If they were judged
irrelevant or dull, the test was credited with 0 points.

d. Are test instructions oral or written?

The issue here was whether successful performance on the test required
only the behaviors being measured by the test items or whether competency
in reading test instructions was confounded with the behaviors purportedly
being measured. If instructions were either completely oral or were
supposed to be read aloud in addition to being written out for examinees,
the test received 1 point; if not, it received 0 points.

e. Are test instructions appropriate and comprehensible?

The instructions, either read by or to the examinees, were inspected
for appropriateness of orientation, tone, syntax, and vocabulary. If the
instructions exhibited appropriateness and comprehensibility on all counts,
the test was credited with 1 point; if not, it was credited with 0 points.



f. Are the instructions comprehensive in their description of task aspects?

The question addressed was whether the instructions clearly and
precisely described all aspects of the tasks the test measured, or left
necessary issues unanswered or unaddressed. If all aspects were described
clearly and precisely, the test was credited with 1 point. If descriptions
were unclear or incomplete or left issues unanswered, the test was
credited with 0 points.

g. Do the test instructions provide illustrative sample items?

If the instructions included sample items that effectively clarified
and accurately illustrated the task(s) involved in the test in such a
way that they were truly representative of the format and difficulty of
test items, the test was credited with 1 point. If there were no sample
items, or if sample items presented were not representative in format or
difficulty, the test was credited with 0 points.



h. Do the test pages (or materials) exhibit good layout designed to facilitate
perception?

Test layout was examined for effective use of perceptual organizers,
such as adequate white space, regularity of item form, symmetry, clarity,
and continuity. If the test page layout was clear and helpful, the test
was credited with 1 point. If the layout was unclear or confusing, the
test was credited with 0 points.

i. Is the physical appearance of the test of high quality?

For this rating, attention was given to the quality (bold, up-to-date) 
of the print and illustrations in printed tests, and the quality of sound 
in auditory or taped tests. If the quality was judged high, the test was 
credited with 1 point; if not, the test was credited with 0 points. 



j. Are oral instructions or items standardized?

To meet this criterion, tests with oral instructions or oral items
(such as "language potential" items) needed a standardized script designed
to be read aloud, or a recorded version, such as a cassette tape. If the
test had one or the other, it was credited with 1 point; anything short
of this (such as merely suggesting topics to be mentioned to the examinee)
was deemed insufficient and the test was credited with 0 points.

k. Is there coherence between item stems and answers?

If item stems, their alternatives, and their answers appeared as a 
unit, in some way adjacent or "belonging to each other," the test was 
credited with 1 point. If the separate components of any item(s) appeared 
not to belong to each other, and therefore demanded untangling, the test 
was credited with 0 points. 

l. Are the time and pacing of the test appropriate?

Tests purporting to be power tests either had to be untimed, or had
to furnish evidence that 90% or more of the validating group attempted
all items. If a test met these conditions, or was appropriately paced, it
was credited with 1 point; if not, the test was credited with 0 points.

m. What is the mode of examinee response?

No points were assigned for this information, but the test or subtest
being evaluated was described as requiring oral (Or), written (Wr), or
mixed (Mi) responses.

n. Is there a simple and direct connection between the item stem and the
examinee's recording of a response?

If the mode of responding was especially simple for the examinee,
such as oral responses, or marking or writing directly on the test form,
the test was credited with 2 points; if the test used standard separate
answer sheets, it was credited with 1 point; if the test was complicated
by the need for more than one step to get from item to answer, it was
credited with 0 points.

The Examinee Appropriateness ratings were summed for a total rating, 
varying from 0 to 15 points. These ratings were translated into letter grades 
of G (good, 12 to 15 points), F (fair, 8 to 11 points), and P (poor, 0 to 
7 points). 



Technical Excellence 

a. Does the test have alternative-form reliability?

The correlation between alternate forms of a test is the subject
of this evaluation. If the appropriate coefficient was .90 or above, the
test was credited with 3 points; if .80 to .90, the test was credited
with 2 points; if .70 to .80, the test was credited with 1 point; and if
less than .70, the test was credited with 0 points.

b. Does the test exhibit stability?

The consistency of scores over time spans of one month or more, as
measured by test-retest reliability, is the subject of this criterion.
If the appropriate coefficient was .90 or more, the test was credited with
3 points; if .80 to .90, the test was credited with 2 points; if .70 to .80,
the test was credited with 1 point; and if below .70, the test was credited
with 0 points.

c. Does the test exhibit internal consistency?

The consistency of items or parts within a part as measured by
split-half or Kuder-Richardson formulas was the focus of this criterion.
If the appropriate coefficient was .90 or more, the test was credited
with 2 points; if .80 to .90, the test was credited with 1 point; and
if below .80, the test was credited with 0 points.
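The three reliability criteria above share one pattern: a coefficient is compared against descending cut-offs. A minimal Python sketch of that pattern follows. The names are hypothetical, and two assumptions are made: each band includes its lower bound (so exactly .80 falls in the ".80 to .90" band), and a missing coefficient earns 0 points, in line with the report's general rule for unavailable information.

```python
from typing import List, Optional, Tuple

# (minimum coefficient, points) pairs, highest cut-off first.
ALTERNATE_FORM: List[Tuple[float, int]] = [(0.90, 3), (0.80, 2), (0.70, 1)]
STABILITY: List[Tuple[float, int]] = [(0.90, 3), (0.80, 2), (0.70, 1)]
INTERNAL_CONSISTENCY: List[Tuple[float, int]] = [(0.90, 2), (0.80, 1)]


def reliability_points(coefficient: Optional[float],
                       cutoffs: List[Tuple[float, int]]) -> int:
    """Return the points for the first cut-off the coefficient reaches."""
    if coefficient is None:
        return 0  # no coefficient reported
    for minimum, points in cutoffs:
        if coefficient >= minimum:
            return points
    return 0  # below the lowest cut-off
```

For instance, `reliability_points(0.87, INTERNAL_CONSISTENCY)` gives 1 point, while the same coefficient would give 2 points on the alternate-form or stability scale.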

d. Can the testing procedures be duplicated?

A test was deemed more desirable if the procedures for administration,
scoring, and interpretation were sufficiently standardized so that
procedures could be duplicated or replicated from the validating group.
If the test provided uniformity of procedure for administration and scoring,
the gross characteristics of the standardization group were replicable,
and the materials, time limits (where applicable), oral instructions, and
preliminary demonstrations were precisely delineated, the test was credited
with 1 point; if not, the test was credited with 0 points.

The Technical Excellence ratings were summed for a total rating, varying
from 0 to 9 points. These ratings were translated into letter grades of
G (good, 6 to 9 points), F (fair, 3 to 5 points), and P (poor, 0 to 2 points).

Administrative Usability 

a. Who should administer the test? 

If regular program personnel, such as a teacher or aide, could read 
the instructions, establish rapport, and conduct the pacing, the test was 
credited with 1 point; if special personnel (such as a reading specialist)
were required, the test was credited with 0 points.

b. How long does it take to administer the test?

If the test could be given in twenty minutes or less, including
instructions, it was credited with 1 point; if not, the test was credited
with 0 points.



This criterion focused on three aspects of the test manual:
discussion of the purpose, uses, and limitations of the test; clear
administering and scoring directions; and description of test development
and validation. If the manual's discussion, directions, and descriptions
were clear and complete, the test was credited with 1 point; if not, it
was credited with 0 points.

d. How many administrators or observers are needed to administer the test?

If not more than one administrator or observer was needed, the test
was credited with 1 point; if more than one was needed, the test was
credited with 0 points.

e. How easy and objective is the scoring?

If the scoring was objective and simple, using a scoring guide, stencil,
or template, or other straightforward process such as answer sheet or
matching stencils, or if machine scoring was available, the test was
credited with 2 points. If the scoring was objective but difficult,
involving more than a stencil or template, such as scoring a passage written
by the examinee for specified content, the test was credited with 1 point.
If the scoring was subjective, requiring the scorer to make a non-trivial
judgment, the test was credited with 0 points.

f. Who can interpret the test scores?

This rating examined whether regular teaching staff could interpret
the test. The answer to this question was either found in an explicit
statement in the test manual, or else was implied from the common and
simple conversion system for the scores. If the score could be interpreted
by teaching staff, the test was credited with 1 point; if not, the test
was credited with 0 points.

g. How great is the range of complexity or difficulty of the test?

Tests using some kind of grade equivalent scheme of reporting or
organizing content, and having a spread of three years or more from the
lowest to highest scores obtainable, or from the easiest to most difficult
materials, were judged to have a reasonably extensive range, and were
credited with 1 point. For tests not using grade equivalent schemes,
if the validating group had a spread of three years or more on an external
criterion task, or if the material in the test was organized around an
extensive hierarchy (or hierarchies) of tasks, the test was credited with
1 point. Otherwise, the test was credited with 0 points.

h. How diverse are the skills measured by the entire test?

If the test had more than one separately reported, interpretable
subtest, it was credited with 1 point for diversity. If not, it was
credited with 0 points. Although this judgment was made for the test as
a whole, it is reported subtest by subtest. Thus, the 1 point for diversity
reported under the "Oral Paragraph Reading" subtest of the Individualized
Reading Placement Inventory refers to the entire Inventory, not to the
"Oral Paragraph Reading" subtest per se.

i. How clear and simple is the process of converting the raw score to the
interpreted score?

If the score conversion procedure was simple, involving one
easy-to-understand step (such as a clear chart or table), or if no conversion was
necessary because the raw scores were interpretable, the test was credited
with 2 points. If the score conversion was complicated by lack of clear
or simple tables or graphs, or if it required two or more steps to get
from the raw to the converted scores (e.g., using one table to get into
another table), the test was credited with 1 point. If the score
conversion was necessary but complicated and lacked tables or graphs,
required many or complicated steps (e.g., computing scores), or was not
explicitly provided, the test was credited with 0 points.

j. How interpretable are the scores?

This evaluation procedure looked for scores that were common and 
simple and could not readily be misunderstood or misused by program 
personnel. If the scores were pass/fail (or some other binary judgment), 
grade equivalents, percentiles, or meaningful raw scores (e.g., a words- 
per-minute reading rate or a precise report of letters for which the 
examinee could not give the sound), the test was credited with 1 point. 
If the scores were any other less common, novel, or ambiguous conversion, 
or conversion was lacking for raw scores not meaningful in themselves, 
the test was credited with 0 points. 

k. Are there scales of performance on real-life skills?

If the test included such scales (e.g., map reading, following
directions, reading classified ads), it was credited with 1 point; if
judgments on such skills were not included, the test was credited with
0 points.

l. Is the validating group representative of the national population of
adults for whom the test was designed?

Five considerations were included in the evaluation of the
representativeness of the groups used to norm the test: (1) Was the sample obtained
through cluster, stratified, or random rather than incidental sampling?
(2) Was the validating done less than five years ago? (3) Was there
geographic representation? (4) Was the validating group composed of
adults at the appropriate educational level (e.g., adult education
students or people of similar characteristics)? (5) Were various
population density characteristics (e.g., urban, suburban, rural, etc.)
represented? If the answers to four or five of these questions, based
upon convincing tabulation for the third, fourth, and fifth ones, were
"yes", the test was credited with 1 point. If there were fewer than four
"yes" answers, the test was credited with 0 points.
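The four-of-five rule for representativeness can be illustrated with a short Python sketch. The question keys are hypothetical shorthand for the five considerations, and counting an unanswered question as "no" is an assumption borrowed from the report's general rule that unavailable information earns no credit.

```python
from typing import Dict

# Hypothetical short names for the five considerations listed above.
REPRESENTATIVENESS_QUESTIONS = (
    "probability_sampling",            # (1) cluster, stratified, or random sample
    "validated_within_five_years",     # (2) validation less than five years old
    "geographic_representation",       # (3)
    "appropriate_educational_level",   # (4)
    "population_density_represented",  # (5) urban, suburban, rural, etc.
)


def representativeness_points(answers: Dict[str, bool]) -> int:
    """Return 1 point if at least four of the five answers are 'yes', else 0."""
    yes_count = sum(
        1 for question in REPRESENTATIVENESS_QUESTIONS
        if answers.get(question, False)  # unanswered counts as 'no' (assumption)
    )
    return 1 if yes_count >= 4 else 0
```

A validation study that satisfies every consideration except recency would still earn the point, since four "yes" answers are enough.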



m. Is racial, ethnic, and sex representation reported in the validation?

If representation on more than one of these characteristics was
reported, the test was credited with 2 points; if representation on only
one characteristic was reported, the test was credited with 1 point;
if no representation was reported, the test was credited with 0 points.

n. Are alternate forms available?

If alternate forms, developed according to the same specifications
to measure the same attributes, were available, the test was credited with
1 point; if not, the test was credited with 0 points.

o. Are alternate forms comparable?

Alternate forms of instruments can be comparable in many ways; there
are considerations of content, approach or method, validities, similarity
of descriptive statistics, and reliabilities. If available information
indicated that the alternate forms were similar on these criteria, then
the test was credited with 1 point. If a test's alternate forms
exhibited low or no comparability, or the test had no alternate forms, it
was credited with 0 points.



p. Can decisions be made?

This final aspect of administrative usability focused upon whether
the test provided information useful in making decisions concerning
individual examinees. If the test manual established definite relationships
between scores and specific decisions through the use of graphs, charts,
cut-off scores, or other means which encouraged fairly specific decisions
(e.g., "a score below this point means the examinee needs remediation to
strengthen his word attack skills"), the test was credited with 2 points.
If the test indicated interpretations of scores that could lead to
specific decisions, or merely presented interpretations or definitions
rather than decisions (e.g., "a high score indicates the need for testing
with a standardized reading test for more accurate information"), the
test was credited with 1 point. If the test provided vague or poor
guidelines, leading to highly intuitive, subjective judgments, or presented no
information useful in making decisions, it was credited with 0 points.

The Administrative Usability ratings were summed for a total rating, 
varying from 0 to 19 points. These ratings were translated into letter grades 
of G (good, 16 to 19 points), F (fair, 12 to 15 points), and P (poor, 0 to 
11 points). 
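The conversion from area totals to letter grades can be summarized in a small Python sketch. The cut-offs are the ones stated in the four summary paragraphs of this section; the dictionary keys and function name are hypothetical.

```python
# Cut-offs as stated in the text: maximum total, minimum for G, minimum for F.
AREA_CUTOFFS = {
    "measurement_validity":     {"max": 12, "good": 10, "fair": 6},
    "examinee_appropriateness": {"max": 15, "good": 12, "fair": 8},
    "technical_excellence":     {"max": 9,  "good": 6,  "fair": 3},
    "administrative_usability": {"max": 19, "good": 16, "fair": 12},
}


def area_grade(area: str, total_points: int) -> str:
    """Translate an area's summed rating into G (good), F (fair), or P (poor)."""
    cutoffs = AREA_CUTOFFS[area]  # raises KeyError for an unknown area
    if not 0 <= total_points <= cutoffs["max"]:
        raise ValueError("total out of range for " + area)
    if total_points >= cutoffs["good"]:
        return "G"
    if total_points >= cutoffs["fair"]:
        return "F"
    return "P"
```

The same total can therefore mean different things in different areas: 9 points is the top of the "fair" band for Measurement Validity but a perfect score for Technical Excellence.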






TEST REVIEWS 

Descriptive information on the individual tests is included in the
following test reviews. The reviews are organized into three general
categories similar to those suggested by Otto: criterion-referenced
functional literacy tests, standardized tests, and informal tests.
Criterion-referenced functional literacy tests measure an examinee's performance on
real-life skills (e.g., reading maps, reading bills and applications) against
a predetermined standard of acceptable performance. Such tests intend to
provide information which is very task-oriented and immediately relevant to
the examinee's everyday activities.

Standardized tests measure an examinee's performance relative to the
performance of others who have taken the test. Although these tests may use
functional literacy tasks for content, they typically measure such traditional 
reading behaviors as vocabulary, comprehension of a reading passage, or spelling. 

Informal tests may be designed to provide information about an examinee's 
general reading level, or about more specific reading abilities, such as 
letter or word recognition. They are often individually administered and seek 
to convey to the examinee a feeling of informality meant to reduce anxiety 
in the testing situation. Usually the directions for administering, scoring,
and interpreting such tests are very short and suggestive, if present at all.

Within each of the three categories, the test reviews are arranged alpha- 
betically by test name. All entries follow a standard format, as outlined below. 

Test Name 

Publisher: The name and address of the firm or individual making
    the test available are given here.

Description: This section indicates what the test is intended to
    measure, describes any subscales included in the test,
    and notes available alternate levels.

Availability of Alternate Forms: This section describes what alternate
    (parallel) forms are available to test users.

Administration Time: This section indicates the time necessary for the
    examinee to take the test, including the time spent in receiving
    initial instructions.

Administration Procedures: This section indicates whether the test is
    administered to individuals or to groups. It further details the
    activities of the examiner and the examinees during the
    test administration.

Materials Used: Materials needed by the examiner and also those needed
    by the examinee are listed here.

Scoring Procedures: Procedures for scoring the test are described in this
    section.

Interpretation Procedures: This section notes what interpretable scores the
    test provides, and specifies the types of conclusions that
    can be drawn or decisions that can be made on the basis of
    test results.

Validity: This section presents the evidence for validity offered
    by the test developer.

Reliability: This section presents the evidence for reliability
    provided by the test developer.

Field Tryouts: This section describes the nature of field tryouts
    conducted with the test. The characteristics of the
    tryout population are included if the test developers
    reported them.

Ratings: This section specifies the pages on which evaluations
    relating to the test may be found.



CRITERION-REFERENCED FUNCTIONAL LITERACY TESTS 

Adult Performance Level Functional Literacy Test (APL) 

Publisher: Dr. Norvell Northcutt
    Division of Extension
    103 Extension Building
    University of Texas at Austin
    Austin, Texas 78712

Description: The APL is a test of functional literacy for adults. There
    are 42 items, many of which involve more than one question.
    The items test an examinee's knowledge of consumer economics,
    law, and health; his ability to perform real-life tasks; and
    his reading and writing ability.

Availability of Alternate Forms: There are no alternate forms available.

Administration Time: The test takes approximately 60 minutes to administer.

Administration Procedures: The test is individually administered in an
    interview format. The examiner reads the questions aloud while
    the examinee follows along in his booklet. The examinee
    then responds, either by reading orally or calling out
    the correct answer from several choices. The examiner
    records the answer given and goes on. If the examinee is
    asked to do a task requiring writing (filling out a check,
    addressing a letter), the examiner gives the examinee the
    questionnaire in which to write his response. Thus all
    answers are recorded in the questionnaire.



Materials Used: Examiner: Questionnaire, pencil.
    Examinee: Booklet, pencil, eraser.



Scoring Procedures: The test is scored in two ways. Multiple choice items
    are scored by comparing the examinee's answer to the
    correct answer indicated in the questionnaire. Questions
    in which the examinee engages in a written task are scored
    according to a system of rules given in the handbook,
    indicating what answers are acceptable and what are not.

Interpretation Procedures: For purposes of initial analysis, scores are
    grouped into quartiles according to the number of points achieved on
    the test. They are interpreted primarily, however, according
    to three APL levels: APL 1 (least competent), APL 2
    (marginally competent), and APL 3 (most competent).

Validity: Validity consists of research showing the relationship of
    items, groups of items, and levels of competence to various
    criteria such as income, education level, and job status.
    There was also a technical review conducted by experts, and
    several cycles of field testing and redesigning. These
    data are too extensive to summarize here.

Reliability: Item difficulty levels comparing earlier surveys and the
    final survey are provided as a measure of reliability.

Field Tryouts: The field tryouts were conducted on a random sampling of
    geographically stratified counties. Three hundred sixty
    counties were chosen and divided into 6 blocks. Each block
    was an independent subsample representing the continental
    U.S. A starting point within each county was randomly
    chosen, and interviewers visited individual residences and
    administered the test. The weighted sample compared very
    closely with the universe in sex, age, education, urban
    distribution, geographical distribution, family income,
    and race.

Ratings: See pages 100-101.



Basic Reading Skills Mastery Test 



Publisher: Services for Educational Evaluation, Inc.
    P.O. Box 261
    Bloomington, Indiana 47401

Description: This test is an objective measure of comprehension in
    functional reading. The test consists of four scored
    subscales: Following Directions, Locating References,
    Gaining Information, and Understanding Forms. There is
    also a non-scored subscale designed to indicate the examinee's
    attitudes and habits in reading for personal development.
    Three levels of the test are available: Level A for 12 year
    olds, Level B for 15 year olds, and Level C for 18 year olds.
    Level C is used for adults.



Availability of
Alternate Forms:

There are no alternate forms available.



Administration
Time:



Two 50-minute administrations are required for the test. 
All students are to be given time to finish the test. 



Administration
Procedures: 



The test is group administered. The examiner provides 
testing materials and reads instructions to the students. 

The examinee reads passages or forms and answers comprehension
questions on an answer sheet.



Materials Used:

Examiner: Examiner's manual, test booklet.
Examinee: Test booklet, pencil, eraser, answer sheet.

Scoring
Procedures:

The answer sheets are computer scored, and the results
are returned on a print-out sheet.






Interpretation
Procedures:

Eighty percent correct or better is considered mastery on
this test.

Validity:

Content validity was based on the conclusions of a
committee of reading specialists regarding functional
reading skills. There were also student reviews of the
items, experts' reviews, and field tryouts.

Reliability:

The K-R 20 yielded an estimate of internal consistency of
.98 for the total test. For the four subscales the K-R 20
values were .87, .91, .93, and .93.

Field Tryouts:

A sample of 2700 Maryland students, including minority
groups, representing urban, suburban, and rural areas
throughout the state, was used to test the three levels
of this test.

Ratings:

See pages 100-101.
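
Several of the reliability figures in these summaries are Kuder-Richardson Formula 20 (K-R 20) coefficients. As a reading aid, the following is a minimal sketch of how such a coefficient is computed from right/wrong item scores. It is not taken from any test manual: the function name `kr20`, the toy response matrix, and the use of the population variance (dividing by n) are assumptions made for illustration.

```python
from typing import Sequence


def kr20(responses: Sequence[Sequence[int]]) -> float:
    """Kuder-Richardson Formula 20 for dichotomous (0/1) item scores.

    responses[i][j] is 1 if examinee i answered item j correctly, else 0.
    Assumption: total-score variance is the population variance (divide by n).
    """
    n = len(responses)          # number of examinees
    k = len(responses[0])       # number of items
    if k < 2:
        raise ValueError("K-R 20 needs at least two items")

    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have zero variance")

    # Sum of p*q over items, where p is the proportion passing the item.
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        pq_sum += p * (1 - p)

    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Hypothetical data: five examinees, four items.
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(demo), 3))
```

Values near 1 indicate that the items rank examinees consistently. A coefficient computed on one sample (for example, the Maryland students above) need not carry over to a different population.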






Reading/Everyday Activities in Life (R/EAL) 



Publisher:

CAL Press, Inc.
76 Madison Avenue
New York, New York 10016

Description:

The test is an objective assessment of functional literacy
presented in nine selected categories of common printed
materials encountered in daily living. English and
Spanish versions are available.



Availability of
Alternate Forms:

There are no alternate forms available.

Administration
Time:

The test requires approximately 20-30 minutes; an examinee
works at his own pace.

Administration
Procedures:

The test may be individually or group administered. The
examiner provides testing materials (i.e., test answer
booklet and cassette tape recorder with R/EAL cassette).
The examinee listens to taped questions which correspond
to material in the test booklet and records answers in
the test booklet.

Materials Used:

Examiner: Examiner's manual.
Examinee: Test booklet, cassette recorder with R/EAL
cassette tape, pencil, eraser.

Scoring
Procedures:

Scoring is done by hand, referring to pre-established
correct responses. Raw scores are totaled for the nine
categories and the total raw score is then converted to
percentage of items passed.






Interpretation
Procedures:

Criterion-referenced - Test items are directly related to
sets of objectives associated with each of the nine reading
activities. Functional literacy is defined as passing
80% or more of the test items (or achieving a raw score
of 36).

Interpretation of Individual Subtests - Following a review
of the examinee's performance on individual subtests, the
interpreter can recommend prescriptive programs to meet
areas of need indicated through detailed task analyses
outlined for each subtest.

Validity:

Criterion-related validity was investigated by computing
the correlation between this test and the Stanford
Achievement Test; the correlation between the two tests
was .74 (n=434). Content validity relies on the selection
of questions from the task analyses which specified test
objectives.

Reliability:

The internal (inter-item) consistency estimate of
reliability, based on K-R 20, was r=.93; the target group
for the reliability sample included a specified sex
distribution, and a majority of minority individuals who
had completed an average of nine years of school and who
had a reading grade equivalent of 5.2 on the Stanford
Achievement Test. No breakdown was provided, however, for
sex, or individual minority representation.

Field Tryouts:

The testing manual indicates the subjects used to
standardize the test included 169 males and 265 females
aged 16-21. The subjects were all low income individuals
and a majority of them were Blacks, Spanish-surnamed or
rural whites. Subjects had completed an average of nine
years of school and had an average reading equivalent of
5.2 on the Stanford Achievement Test.

Ratings:

See pages 100-101.
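
The R/EAL criterion (80% of items passed, or a raw score of 36) can be written as a small calculation. The sketch below is illustrative only. The item count of 45 is an inference from the two figures quoted above (36 is 80% of 45) and is not stated in this summary, and the function names are hypothetical.

```python
def percent_passed(raw_score: int, n_items: int) -> float:
    """Convert a total raw score to percentage of items passed."""
    return 100.0 * raw_score / n_items


def min_raw_for_criterion(n_items: int, criterion_pct: int = 80) -> int:
    """Smallest raw score that reaches the criterion percentage.

    Integer ceiling division avoids floating-point rounding surprises.
    """
    return -(-n_items * criterion_pct // 100)


def meets_criterion(raw_score: int, n_items: int, criterion_pct: int = 80) -> bool:
    """True if the raw score is at or above the criterion."""
    return raw_score >= min_raw_for_criterion(n_items, criterion_pct)


# Assumption: 45 items in total, inferred from "80% ... or a raw score of 36".
N_ITEMS = 45

print(min_raw_for_criterion(N_ITEMS))
print(percent_passed(36, N_ITEMS), meets_criterion(36, N_ITEMS))
print(percent_passed(35, N_ITEMS), meets_criterion(35, N_ITEMS))
```

The same 80 percent rule appears in several of the tests reviewed here, so only the item count would need to change to apply the sketch elsewhere.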






Wisconsin Test of Adult Basic Education (WITABE) 



Publisher: 



Rural Family Development Program 
University Extension 
University of Wisconsin 
Madison, Wisconsin 53706 



Description:



This test was especially designed to monitor the basic
skills achievement of persons enrolled in the Wisconsin Rural
Family Development Program. The test appears appropriate
for general use with adults who read below high school level.



Availability of
Alternate Forms:

There are no alternate forms available.



Administration 
Time: 



The test is generally untimed; however, the maximum
administration time for the two reading sections combined 
should be less than one hour. 



Administration 
Procedures: 



The testing conditions are very flexible. The examinee
works at his or her own pace; the examiner's only responsibility
is to ensure that the written instructions are understood.
The test may be administered individually or to groups. The
WITABE consists of verbal and coping skills sections, both
of which might loosely be considered "reading" tests. The
skills required to complete the coping skills subtest
include using a road map, ordering by mail, filling out a
tax return, using a phone book, and a variety of comparable
tasks. A numerical subtest is also part of the WITABE. Any
of the sections may be given separately.






Materials Used: 



Examiner: Test booklet. 

Examinee: Test booklet, pencils, eraser. 




Scoring 
Procedures: 



Scoring is done by hand; responses are compared with
pre-established correct answers. A few questions in the
coping skills subtest are worth more than one point,
but assignment of points is still objective and relatively
simple. The raw score obtained is not converted.



Interpretation 
Procedures: 



The WITABE was developed to measure differences between
treatment groups and control groups in the Wisconsin
program. Raw scores were adequate for this purpose and
thus no score interpretation process exists. Test scores
cannot at this time be converted into grade equivalents,
percentiles or other norm-comparisons; nor is any criterion-
referenced diagnostic information given.



Validity: 



Without giving numerical information, the authors state
that the test data item analysis conducted by the University
Psychometric Laboratory, which involved field test results
from 120 rural Wisconsin 6th, 7th and 8th graders, led to
rejection of unsuitable items. The modified instrument was
administered to 37 adults to determine the psychometric
quality of the items.



Reliability : 



The authors report that the Hoyt reliability index for the
20-item verbal subtest was .90. The reliability for the
29-item coping skills subtest was also reported as .90.



Field Tryouts:

The WITABE has been used by the Wisconsin Rural Family
Development Program with the 120 public school students and
37 adults mentioned above, and with treatment and control
groups chosen for Rural Family Development Program evaluation.
The makeup of the latter two groups was specified by age,
sex, and geographic location; however, no scoring or
norming data were provided.

Ratings:

See pages 100-101.



STANDARDIZED TESTS 



Adult Basic Learning Examination (ABLE), Level I 



Publisher: 



Harcourt Brace Jovanovich, Inc.

757 Third Avenue 

New York, New York 10017 



Description:



The test is designed to determine the general educational
level of adults. It consists of three levels: Level I
(Grades 1-4), Level II (Grades 5-8), and Level III (Grades
9-12). Each level includes vocabulary, reading, spelling
and arithmetic tests. (The arithmetic test was not reviewed.)



Availability of
Alternate Forms:

Alternate forms A and B are available.



Administration 
Time: 



Estimated times for administration of the subtests are: 
vocabulary, 20 minutes; reading, 30 minutes; spelling, 
15 minutes. 



Administration 
Procedures: 



The ABLE handbook recommends group administration. However,
this test could be individually administered as well. The
vocabulary and spelling tests are dictated to the examinee,
who indicates his answers by shading in an oval in his test
booklet under his word choice. The vocabulary section
requires sentence completion; three word choices are given.
Examinees complete the reading section independently,
choosing the correct word to complete a thought.






Materials Used: 



Examiner: Test handbook, scoring key and group scoring 
record. 

Examinee: Test booklet and pencil. 






Scoring
Procedures:

A key is provided in the packet for hand scoring, but
scoring can be done by machine.

Interpretation
Procedures:

The number of items right for each test can be interpreted
in terms of grade level equivalent. Grade level equivalents
are the only conversion provided. The test developers
also suggest that users develop local norms.

Validity:

Concurrent validity studies are reported, based on test
administration of the ABLE and the Stanford Achievement
Test (SAT) to a school group within a week's time.
Correlations among appropriate scales ranged from .60 to
.76. In addition, correlations were computed between the
SAT Paragraph Meaning Scale and ABLE for a Job Corps
group. These correlations ranged from .36 to .72.

Reliability:

Split-half (odd-even) reliability coefficients adjusted by
the Spearman-Brown formula are reported for grade 3 of the
school group (.87 for vocabulary, .93 for reading, .95 for
spelling), grade 4 of the school group (.89 for vocabulary,
.93 for reading, .95 for spelling), the Job Corps group
(.85 for vocabulary, .96 for reading, .96 for spelling), and
a group of adult basic education students (.91 for vocabulary,
.98 for reading, .94 for spelling).

Field Tryouts:

ABLE was administered to three groups: 1) elementary and
junior high school students, 2) Job Corps members, and
3) Hartford-New Haven adult students. The school group
consisted of 1,000 pupils per grade (grades 2-7) from four
school systems in four states. The Job Corps group consisted
of approximately 800 young men in both urban and
conservation centers. The Hartford-New Haven group
consisted of approximately 450 adults enrolled in basic
education classes in those two cities. Statistics on
ethnic composition and educational level are displayed
in the test handbook.

Ratings:

Reading, see pages 102-103.
Spelling, see pages 102-103.
Vocabulary, see pages 102-103.
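
The ABLE reliabilities are odd-even split-half coefficients stepped up with the Spearman-Brown formula. A minimal sketch of that procedure follows, assuming right/wrong item scores. The helper names (`pearson`, `spearman_brown`, `split_half_reliability`) and the toy data are hypothetical and are not drawn from the ABLE handbook.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half: float) -> float:
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(responses: Sequence[Sequence[int]]) -> float:
    """Odd-even split-half reliability with Spearman-Brown adjustment."""
    odd = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return spearman_brown(pearson(odd, even))


# Hypothetical data: five examinees, four items (1 = correct).
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(split_half_reliability(demo), 2))
```

The adjustment is needed because each half is only half as long as the full test, and shorter tests are less reliable. The unadjusted half-test correlation would understate the reliability of the whole.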






Basic Occupational Literacy Test (BOLT), Fundamental Level 



Publisher:

U. S. Department of Labor

Description:

The test is designed to measure the basic reading and
arithmetic skills of educationally disadvantaged adults.
There are four subtests: reading vocabulary, reading
comprehension, arithmetic computation, and arithmetic
reasoning. Each test is available at four difficulty
levels.

Availability of
Alternate Forms:

Three alternate forms are available for the first three
levels. The advanced level offers two forms for each
subtest.

Administration
Time:

Fifteen minutes for each subtest.

Administration
Procedures:

Before administering the subtests, each examinee is given
the Wide Range Scale (included with the test) to determine
the appropriate level of BOLT to administer. Directions
are given orally to individuals or small groups. Each
examinee records his answers on an answer sheet by marking
the appropriate circle.

Materials Used:

Examiner: Manual, scoring key, stop watch, test record
cards.
Examinee: Test booklet, answer sheet, pencil, paper clips,
scratch paper.






Scoring
Procedures:



Scoring can be done either by hand or by machine. Hand
scoring is done by placing a stencil over the answer
sheet and counting the number of visible marks. The
total number of correct responses can then be converted
to a standard score or General Educational Development
(GED) level using conversion tables contained in the
User's Manual.



Interpretation
Procedures:

Once scores are converted to GED levels they can be
compared to the GED levels for occupations listed in
the Dictionary of Occupational Titles. One must be familiar
with GED scores as well as standard scores in order to
interpret scores for the BOLT.

Validity:

To establish content validity, directions, test items,
and time limits were given a preliminary tryout. Following
revision, extensive field testing was conducted and an
intricate set of item analysis rules led to development of
the final forms. Construct validity research was conducted
to answer general questions about testing disadvantaged adults.

Reliability:

Internal consistency of the subtests was judged according
to K-R 20, and computed for each subtest as it was administered
to each subgroup. K-R 20 coefficients for the final forms
(fundamental level) were: vocabulary .79 (form A) and
.80 (B), and comprehension .77 (A) and .76 (B).

Field Tryouts:

A preliminary tryout was conducted on 453 persons. The
sample, from 10 states, was stratified by geographic area,
sex, age, education, and minority group status. A
similarly stratified sample of more than 8,000 subjects
from 33 states took part in the major field testing of
11 experimental forms of the reading tests. Some
1600 of these subjects were given various forms of what
became the fundamental level tests. Extensive breakdowns
of subjects by geographic area and by minority group
status were presented.

Ratings:

Comprehension, see pages 102-103.
Vocabulary, see pages 102-103.






General Educational Performance Index (GEPI)

Publisher:

Steck-Vaughn Company
807 Brazos
P.O. Box 208
Austin, Texas 78767

Description:

This test of high school equivalency was designed
to predict success on the General Educational Development
Test. Although it is divided into five subscales,
this evaluation is concerned only with tests 1
(Correctness and Effectiveness of Expression) and 2
(Literature Interpretation).



Availability of
Alternate Forms:

Alternate forms A and B are available.



Administration
Time:

Although the test is untimed, it is estimated that
tests 1 and 2 each require from 20 to 40 minutes.

Administration
Procedures:

The test is group administered. The examiner distributes
test booklets and answer sheets; the examinee
reads instructions for each subscale, then reads the
items and records his answers.

Materials Used:

Examiner: Copy of test booklet and the Manual of Directions.
Examinee: Copy of test booklet, a score sheet, a pencil,
an eraser.

Scoring
Procedures:

Scoring the test is a simple, objective process. The
examiner uses a template to mark the examinee's answer
sheet, then counts the number of correct answers per subscale.
This raw score is then converted to a standard score
using a table in the manual.

Interpretation
Procedures:

The standard scores in the GEPI subscales should give
the examinee and examiner an idea of the examinee's
readiness for the GED. There are tables comparing the
standard GEPI scores and GED scores. The GED is a
pass/fail test, with a specified cut-off standard score
(40 in some states). The GEPI subscales can also be
used to isolate areas of weakness, to determine groupings
for instruction, or (in a re-test situation)
to see if the examinee has made progress.

Validity:

Content validity consists of the authors' statements
that the test was prepared and reviewed by experts who
had thoroughly researched the field. Also, the literacy
related scales of the GEPI correlate with the appropriate
GED scales in the range of .62 to .70.

Reliability:

The alternate form reliability coefficients were .73
for Correctness and Effectiveness of Expression and .68
for Literature Interpretation.

Field Tryouts:

Field tryouts were conducted in 1974 on 181 adult
students randomly chosen from all parts of Texas and
enrolled in a variety of GED programs. Data were
provided on sex, racial and age composition of the
group.

Ratings:

Correctness and Effectiveness of Expression, see pages
100-101.
Literary Interpretation, see pages 100-101.






SRA Reading Index 



Publisher: 



Science Research Associates, Inc. 
259 Erie Street 
Chicago, Illinois 60611



Description:



This test has five parts. Part (Level) I, Picture-
Word Association, tests the student's ability to
associate a word with a picture of an object. Part
(Level) II, Word Decoding, tests a student's ability
to choose the right word to complete a sentence.
Part (Level) III, Phrase Comprehension, requires
the student to choose the appropriate word or
phrase to complete a sentence. Part (Level) IV,
Sentence Comprehension, requires the student to
choose an accurate paraphrase of a given sentence.
Part (Level) V, Paragraph Comprehension, has the
student read a paragraph and answer comprehension
questions.



Availability of 
Alternate Forms: 



There are no alternate forms available.



Administration 
Time: 



The test can be given in one 25-minute timed session,
but timing is not required.



Administration 
Procedures:



The test is group administered. The examiner first
reads instructions orally to the group; examinees then
read the questions and mark the appropriate answers.






Materials Used:

Examiner: Examiner's manual, test booklet.
Examinee: Test booklet, two lead pencils.

Scoring
Procedures:

The test booklets are self-scoring; a student's
marks are transferred through carbon paper to a
key. The examiner counts the correct responses
(those within boxes on the key) and records them
for each part (level). He then records this
number on the cover of the booklet.

Interpretation
Procedures:

Two sets of norms are given for the test: special
education norms and industrial norms. Also, the test
booklet includes a chart indicating the number of
correct items needed to pass a given level. This
number is based on an 80 percent proficiency criterion.
The Examiner's Manual discusses use of these scores
in relation to job analysis and minimum proficiency needs
for certain jobs.

Validity:

Content validity relies on the method of choosing items
for the test. A pool of items was developed for
the test, and then screened by the language department
of a Job Corps center for appropriateness and
ambiguity. A concurrent validity study was conducted
in which the Reading Index scores were correlated
with overall job ranking of people in twenty-one
occupations. The largest coefficient was .32, and ten
coefficients were significant at the .05 level. Also,
correlations between the Reading Index and the Flanagan
Industrial test for each of the occupation groups
are shown in the manual.

Reliability:

The K-R 20 reliability coefficient was .87. The
Raju-Guttman Homogeneity Index was .93. The group
tested consisted of 87 men and women from a combination
on-the-job training and basic education program in
Chicago.

Field Tryouts:

This test was pre-tested on a total of 675 males and
females enrolled in special- and adult-education programs
in Colorado and South Carolina. It was given to a group
of 87 men and women in a combination on-the-job training
and basic education program in Chicago in order to
establish special education norms. Also, the test was
given to 3274 workers to establish industrial norms for
whites and nonwhites.

Ratings:

See pages 100-101.






Tests of Adult Basic Education (TABE), Level E

Publisher:

CTB/McGraw-Hill
Del Monte Research Park
Monterey, California 93940

Description:

The test provides a system for measuring the reading
achievement level of adults, based upon a corresponding
level of the California Achievement Tests. Level E
(Easy) is intended for adults with severe educational
limitations or for those from culturally disadvantaged
backgrounds. It is intended for the "upper primary"
levels, or Grade 2 to beginning Grade 4 level. Level M
(Medium) is adapted from the elementary level of the
CAT, and Level D (Difficult) is adapted from the junior
high school level of the CAT.



Availability of
Alternate Forms:

Alternate forms 1 and 2 are available.



Administration 
Time: 



The Reading Vocabulary section takes 9 minutes; Reading
Comprehension takes 31 minutes (total time = 40 minutes).
It is permissible to provide a break or rest period
after any of the test sections.



Administration 
Procedures: 



The test is group administered. The examiner reads
general test directions to students before each section,
and then reads section directions, which are also
printed in the test booklet. Examinees record their
answers in test booklets. Each test section has an
established time limit. A set of practice exercises
and a locator test are available; these are designed
specifically for pre-testing. The practice exercises
familiarize examinees with the mechanics of the test,
while the locator test, a short vocabulary test,
provides a basis for determining the level of TABE
best suited for a particular individual.

Materials Used:

Examiner: Examiner's manual, blackboard, stopwatch.
Examinee: TABE test booklet, pencil, eraser.

Scoring
Procedures:

Scoring consists of matching a hand-scoring key to
corresponding pages in the examinee's test booklet.
The score for each test is the number right; this is
recorded on the bottom right hand corner of the last
page of each section in the test booklet. The total
right for each section is transferred to an appropriate
box on the Profile Sheet. Total section raw scores
are added together to obtain the total raw score for
the test. Total raw scores are then plotted according
to a grade placement level on the Profile Sheet.

Interpretation
Procedures:

In addition to the grade-placement level, the Profile
Sheet provides an "Analysis of Learning Difficulties."
The analysis is completed by recording a student's
errors in each section; the items in each section are
listed according to skill areas. The resulting learning
profile becomes a basis for planning remedial or
developmental work and individualized instruction
needed by the student.
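
The TABE scoring and error-analysis steps described above amount to counting items right against a key and tallying the missed items by skill area. The sketch below illustrates that bookkeeping only. The answer key, the item-to-skill mapping, and the skill-area names are all hypothetical and do not come from the TABE Profile Sheet.

```python
from collections import Counter
from typing import Dict, Tuple


def learning_difficulty_profile(
    key: Dict[int, str],
    answers: Dict[int, str],
    skill_of_item: Dict[int, str],
) -> Tuple[int, Dict[str, int]]:
    """Return (raw score, errors per skill area) for one test section.

    An omitted item counts as an error, as does a wrong answer.
    """
    raw_score = 0
    errors: Counter = Counter()
    for item, correct in key.items():
        if answers.get(item) == correct:
            raw_score += 1
        else:
            errors[skill_of_item[item]] += 1
    return raw_score, dict(errors)


# Hypothetical four-item section; item 4 was left blank by the examinee.
key = {1: "a", 2: "c", 3: "b", 4: "d"}
skill_of_item = {1: "word meaning", 2: "word meaning",
                 3: "main idea", 4: "main idea"}
answers = {1: "a", 2: "b", 3: "b"}

raw, errors = learning_difficulty_profile(key, answers, skill_of_item)
print(raw, errors)
```

Section raw scores produced this way would be summed for the total raw score. Conversion to a grade placement level depends on the published Profile Sheet and is not reproduced here.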



Validity:

Claims for content validity are based upon item
selection procedures for the California Achievement
Test, from which the test has been adapted.

Reliability:

No information is available on reliability.

Field Tryouts:

No information is available on field tryouts.

Ratings:

Comprehension, see pages 102-103.
Vocabulary, see pages 102-103.






INFORMAL TESTS 



Adult Basic Reading Inventory 



Publisher: 



Scholastic Testing Service 
480 Meyer 

Bensenville, Illinois 60106 



Description:



This test has five parts. Part I tests the student's 
ability to associate a word with a picture. Part II 
tests the student's sound and letter discrimination. 
Part III tests the student's ability to associate 
synonyms (or related words) as he reads the words. 
Part IV is similar to Part III, except that the 
student hears the words read orally. Part V requires 
the student to read paragraphs and answer comprehension 
questions. 



Availability of 
Alternate Forms: 



There are no alternate forms available.



Administration 
Time: 



The test can be administered in one session; Parts I
and II each require 5 minutes, Parts III and IV each
require 10 minutes, and Part V requires 15 minutes.



Administration
Procedures:

The test is group administered. In Part I, the examiner
reads instructions and examinees underline words
associated with pictures. In Part II the examiner
reads words to the examinees, and examinees underline
words beginning with the same sound as the word read
by the examiner. In Part III, the examiner reads
instructions, and examinees underline the word in a
list which has about the same meaning as a word
written to the side. In Part IV, the examinee performs
the same task; however, the words are read
orally by the examiner. In Part V, the examiner
reads the instructions, and examinees read paragraphs
and choose the correct answer to comprehension
questions.

Materials Used:

Examiner: Manual of directions.
Examinee: Test booklet, line marker, two colored pencils,
eraser.



Scoring 
Procedures: 



Scoring is objective and simple. The examiner simply
compares the student's answers in the test booklet to
a scoring key. For each part he indicates the number
of correct answers. Each raw score is then converted
to a percentage score according to instructions
provided in the Manual.



Interpretation
Procedures:

The Manual indicates how to assess an examinee's reading
ability by approximate grade levels, or in terms
of functional or absolute illiteracy as defined in the
Manual. It also offers some general suggestions on
assessing areas of weakness and aspects of remediation.

Validity:

Concurrent validity studies have been done with the
Gates Advanced Primary Reading Test and with teacher
ratings of student abilities from pre-primer to fifth
grade using a 9-point scale. Correlations with the
Gates test ranged from .82 to .88 and with the
teacher estimates from .67 to .76.

Reliability:

Reliability studies were conducted on 38 adults in
an adult literacy project in an urban area of Northern
Illinois. The K-R 21 coefficient was .98.

Field Tryouts:

Small scale tryouts of the test were conducted with
38 adults and 17 juvenile male retarded readers.
The adults were involved in an adult literacy project
in an urban area of Northern Illinois. No sex, ethnic,
or racial breakdown was given.

Ratings:

Part I, Sight Words, see pages 104-105.
Part II, Sound and Letter Discrimination, see pages 104-105.
Part III, Word Meaning (Reading), see pages 106-107.
Part IV, Word Meaning (Listening), see pages 106-107.
Part V, Context Reading, see pages 104-105.
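
The reliability figure for this inventory is a K-R 21 coefficient, which needs only the number of items and the mean and variance of total scores, on the simplifying assumption that all items are equally difficult. The following is a minimal sketch under that assumption. The function name `kr21`, the toy scores, and the use of the population variance are illustrative choices, not details from the test manual.

```python
from typing import Sequence


def kr21(total_scores: Sequence[float], n_items: int) -> float:
    """Kuder-Richardson Formula 21 from total scores alone.

    Assumes equal item difficulty; variance is the population
    variance (divide by n), which is an assumption of this sketch.
    """
    n = len(total_scores)
    mean = sum(total_scores) / n
    variance = sum((t - mean) ** 2 for t in total_scores) / n
    if n_items < 2 or variance == 0:
        raise ValueError("need at least two items and nonzero variance")
    k = n_items
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * variance))


# Hypothetical total scores for five examinees on a four-item part.
print(round(kr21([4, 3, 2, 1, 0], 4), 3))
```

When item difficulties differ, K-R 21 generally comes out no higher than K-R 20, so the two coefficients reported across these summaries are not strictly interchangeable.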






Cyzyk Pre-Reading Inventory

Publisher:

Janet L. Cyzyk (Author)
Adult Reading Specialist
Baltimore County Board of Education
6901 N. Charles Street
Towson, Maryland 21204

Description:

The Inventory consists of various activities designed
to help a teacher recognize deficiencies within discriminatory
and perceptual skills in the visual, auditory,
and perceptual motor areas that must be dealt
with before an adult non-reader can begin learning to
read.

Availability of
Alternate Forms:

There are no alternate forms available.

Administration
Time:

There are nine separate short sections to the test.
Examinees may be given any number in a single session.
The tests are untimed; no estimate is given of the
testing time required.

Administration
Procedures:

The Inventory may be individually or group administered.
Each examinee receives a test booklet in which to
underline the correct answers. Instructions are given
orally by the examiner. Examinees do some of the
activities independently, and in the remaining activities
respond to lists of words read by the examiner.

Materials Used:

Examiner: Test directions.
Examinee: Test booklet, pencil.






Scoring
Procedures:

The test is hand-scored by the examiner who determines
the adequacy of each response. In its present form
it serves only to provide diagnostic information to
the teacher who seeks, through personal evaluation of
test results, to identify students' deficiencies.

Interpretation
Procedures:

The test activities measure examinee abilities in
motor skills, reading functional words, perception
of letter forms, order and sequence of letters and
digits, handwriting speed, auditory discrimination,
word perception and word discrimination. Poor
examinee performance on any of the sections suggests
that the teacher should conduct additional testing
on an individual basis.

Validity:

No information is available on validity.

Reliability:

No information is available on reliability.

Field Tryouts:

No information is available on field tryouts.

Ratings:

See pages 106-107.






Harris Graded Word List and 
the Informal Textbook Test 



Publisher: 



Adult Continuing Education Resource Center

Montclair State College 

Upper Montclair, New Jersey 07043 



Description:



These two tests are used together. The Harris Graded
Word List consists of seven lists of words representative
of varying reading levels. The Informal Textbook
Test, given to applicants who score above grade level
2.0, involves a series of seven passages (at reading
levels 2-8), each followed by a list of comprehension
questions.



Availability of
Alternate Forms:

There are no alternate forms available.



Administration 
Time: 



The Harris Graded Word List requires only one minute 
for each examinee. The administration time for the 
Informal Textbook Test (group administered) is not 
known. 



Administration 
Procedures: 



The Harris Test is individually administered. The 
examiner has the examinee read each list of words, 
noting mentally the level at which he makes 3 or 4 
errors. This level is later entered on the registration 
form. Examinees who score above 2.0 reading level take 
the group administered Informal Textbook Test. The 
examinee reads seven passages and answers the compre- 
hension questions in the booklet. 






Materials Used:

Examiner: Harris Graded Word List, pencil.
Examinee: Informal Textbook Test booklet, pencil,
eraser.



Scoring
Procedures:



Harris Graded Word List: The examiner mentally
notes at which level the examinee makes 3 or 4
errors in reading words. Informal Textbook Test:
The examiner compares the examinee's responses
with pre-established correct responses.



Interpretation 
Procedures: 



Validity : 



Reliability : 
Field Tryouts ; 



Harris Grade Word List: If the examinee does not 
read above 2.0 reading level, he is classified as 
a beginning reader. Informal Textbook Test: The 
examinee's instructional level is determined by 
noting at which reading level he scores 2-3 (out of 
a possible 4). Any score below 2 indicates he 
should be in a beginning group. 

Validity consists of the author's statement that 
the Harris Graded Word List is "scientifically 
organized," and that the standards for reading levels 
are based upon the Dale-Chall Formula and ratings 
given in a combined word list by Buckingham and Dolch. 

No information is available on reliability. 

No information is available on field tryouts. 






Ratings:



Harris, see pages 106-107.
Informal, see pages 104-105.






Idaho State Penitentiary Informal Reading 
Inventory 



Publisher: 



The Reading Education Center 
Boise State University 
Boise, Idaho 83720 



Description : 



Availability of 
Alternate Forms: 



Administration 
Time: 



The Inventory is designed to provide a reading
teacher with a student's estimated Independent
reading level, estimated Instructional level,
estimated Frustration level, estimated listening
level, specific word recognition deficiencies, and
specific comprehension deficiencies. The test is
applicable specifically to penal adult populations,
and particularly to those persons who have difficulty
learning to read.

Alternate forms A and B are available. Each is 
divided into two major sections. Word Lists and Stories. 
The two forms are bound in one booklet to facilitate 
repeated administration. 

The word lists require approximately 10 minutes. 
Each of the eight stories (corresponding to grade 
levels in difficulty) takes 5-10 minutes to read 
aloud. The estimated time for administration of
comprehension tests following each story is five
minutes per story. All of the stories need not be
administered at one sitting.






Administration 
Procedures: 



Materials Used: 



Scoring 
Procedures: 



The test is individually administered by a reading
teacher. The examinee reads words selected from each 
of the stories aloud while the examiner codes errors 
on a copy of the word lists, beginning with the first 
grade level story. The examinee continues pronouncing 
words until three words within one list have been 
missed. For the oral stories, the examinee reads
each story aloud while the examiner codes errors. The 
coding procedures suggested are somewhat complex and 
not standardized. After the examinee has finished 
the oral reading, the examiner asks comprehension 
questions on each of the stories, recording correct 
and incorrect responses. 
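
The stopping rule for the word lists can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the data layout (a mapping from grade level to a list of correct/incorrect flags), and the sample responses are assumptions, not part of the Inventory's manual.

```python
def administer_word_lists(responses_by_level, max_misses=3):
    """Walk the word lists from the lowest grade level upward.

    responses_by_level maps a grade level to a list of booleans
    (True = word pronounced correctly). Testing stops as soon as
    max_misses words have been missed within a single list.
    Returns the last level attempted and the misses recorded per level.
    """
    misses = {}
    last_level = None
    for level in sorted(responses_by_level):
        missed = 0
        for correct in responses_by_level[level]:
            if not correct:
                missed += 1
                if missed == max_misses:
                    break  # stop within this list at the third miss
        misses[level] = missed
        last_level = level
        if missed >= max_misses:
            break  # do not go on to harder lists
    return last_level, misses


# Hypothetical examinee: clean on level 1, three misses on level 2.
sample = {
    1: [True] * 10,
    2: [True, False, True, False, False, True],
    3: [True] * 5,
}
level, misses = administer_word_lists(sample)
print(level, misses)
```

The level 3 list is never attempted in this example because the third miss occurs within the level 2 list.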

Examiner: Pencil, teacher's copy of Student Word
List and Student Stories, recapitulation
sheet, manual of directions.

Examinee: Student's copy of Word List and Stories. 

Scoring consists of a complex and highly detailed 
system of coding to note student errors in oral 
reading. Scoring of comprehension questions is done 
using a guide for acceptable answers. Percentage 
scores are used to determine achievement level 
(roughly corresponding to grade levels 1-8) on the
word list portion of the test. On the oral reading 
portion of the test, word recognition and comprehension 
errors are recorded following each story. The examiner
then transfers the errors in each story (grade
level) into the terms Independent, Instructional,
Frustration and Listening to indicate the examinee's
ability, in each category, in correspondence to a
grade level. All scores are recorded on the Recapitulation
Sheet, which provides an estimated picture
of the examinee's composite reading ability.



Interpretation
Procedures:

Information recorded on the Recapitulation Sheet is 
intended to establish the examinee's estimated 
Independent, Instructional, Frustration and Listening 
levels in a manner roughly corresponding to grade
levels. It also shows specific strengths and weaknesses
in word recognition and comprehension as well
as in pronunciation. The interpretation procedures
are subjective, with judgments and estimates left to
the examiner's discretion.



Validity:



Content validity consists of the authors' claim that 
the stories are designed to appeal to the penal 
inmate-student. All stories were subjected to readability
formulas (Botel, Dale-Chall and Flesch) to
coincide with other graded materials. 



Reliability:



No information is available on reliability.



Field Tryouts: No information is available on field tryouts.



Ratings:



See pages 104-105. 






An Informal Reading Inventory for Use by 
Teachers of Adult Basic Education 



Publisher: 



Description; 



Availability of 
Alternate Forms: 



Office of Adult Basic Education 
State Department of Education 
Concord, New Hampshire 03301 

This test measures reading performance from level 1
through level 6. These levels correspond with the 
levels in graded readers. The inventory has four 
parts: Part I, Word Recognition (testing word attack 
skills and vocabulary level); Part II, Oral Reading 
and Comprehension questions; Part III, Listening 
Ability (present potential level); and Part IV, Visual 
and Auditory Perception and Discrimination (used for 
examinees who cannot function at the introductory 
level of Part I).

There are no alternate forms available.



Administration
Time:



The time required for the test is not specifically
indicated, though administration probably requires
from 20 to 30 minutes, depending on how soon a
student reaches his frustration level.



Administration
Procedures: 



The test is individually administered. In Part I 
the examiner exposes words for one second for the 
examinee's flash recognition. If the examinee 
misses the word, he is allowed to analyze it. In
Part II the examinee reads paragraphs orally and
answers comprehension questions. In Part III the 
examiner reads paragraphs orally to the examinee 
and the examinee responds to comprehension questions. 
Part IV is administered to examinees who cannot func- 
tion at the introductory level of word recognition. 
The examinee names letters pointed out to him, gives
the sounds of blends, and writes the initial, final, 
or middle sounds of words read to him. 

Materials Used: Examiner: Informal Reading Inventory Booklet, pencil,
two 3x5 cards.
Examinee: Paper, pencil, eraser.

Scoring
Procedures: The scoring of this test is objective, but fairly complicated.
The examiner must record each error the
student makes, using a system of notations. The number 
of words correctly recognized in Part I is totalled. 
In Part II, the examiner computes the number of reading 
errors and percentage of comprehension questions answered 
correctly. In Part III, the examiner computes the number 
of comprehension questions answered correctly. In Part IV, 
the examiner records the examinee's oral errors to letter 
recognition and blending tasks and hand scores the written 
responses to the auditory discrimination tasks. 







Interpretation 
Procedures: 



Validity:
Reliability:
Field Tryouts:
Ratings:



Based on the scores, the examiner computes the 
examinee's independent level, instructional level 
and frustration level. These levels correspond 
closely with comparable levels in a graded reader. 

No information is available on validity. 

No information is available on reliability. 

No information is available on field tryouts. 

Part I, Word Recognition, see pages 106-107.
Part II, Oral Reading, see pages 104-105.
Part III, Present Potential Level, see pages 104-105.
Part IV, Visual and Auditory Perception and
Discrimination, see pages 106-107.






Individual Reading Placement Inventory 



Publisher: Follett Educational Corporation
1010 West Washington Blvd.
Chicago, Illinois 60607 



Description: This test is divided into five parts. Part I, Word
Recognition and Analysis, tests a student's knowledge
of sight words and his ability to decode words he 
cannot immediately recognize. Part II, Oral Para- 
graph Reading tests the student's oral reading 
skills and comprehension. Part III, Present Language 
Potential tests the student's comprehension of 
paragraphs read to him. Part IV tests the student's 
auditory discrimination. Part V, which is not 
scored, tests the student's ability to name letters 
of the alphabet and their sounds. This test is used 
only if the student scores 1.0 on Part I. 



Availability of
Alternate Forms: Alternate forms A and B are available.



Administration
Time: The test has four parts, each of which requires approximately
10-20 minutes, depending on how many items a student is
able to complete before reaching his frustration level.



Administration
Procedures: The test is individually administered. In Part I the
examiner asks the examinee to read words aloud, either
by recognition or word analysis. In Part II the examinee
reads paragraphs orally and answers comprehension questions.
In Part III the examinee listens to paragraphs read
orally by the examiner and answers comprehension 
questions. In Part IV the examiner reads lists of 
words orally and the examinee identifies the word
in each list that begins or ends differently or has
a different vowel sound in the middle. In Part V
(used only if examinee scores 1.0 on Part I) the
examiner points to letters of the alphabet, and the
examinee names each letter and gives one sound of
the letter. 



Materials Used:



Scoring
Procedures:



Interpretation
Procedures:



Examiner: Student's Test and Scoring Booklet, pencil,
word recognition wheels, paragraphs on
cards.

Examinee: No equipment needed.

The examiner records the student's errors on each part
of the test using an objective, but (for Parts I and II)
quite complicated system of notations. The errors
are then totalled.

On the basis of the number of items missed per level, 
the student's Independent Level, Instructional Level, 
and Frustration Level are computed. Each level of 
the test is apparently comparable to a grade level.
The Student's Test and Scoring Manual also has places
for the examiner to indicate a student's specific reading
problems: word analysis, recitation, rate difficulties,
etc. 






Validity : 



Content validity consists of the author's reliance 
upon the researchers' formulas in determining levels 
of reading difficulty. A concurrent validity study
correlated three tests of silent reading ability to
the Individual Reading Placement Inventory. The tests
used were the Rasof-Neff (r = .89, N = 146), the
Stanford Achievement (r = .78, N = 75) and the
California Achievement (r = .87, N = 104).



Reliability:



Reliability coefficients were obtained by using
alternate forms in pre- and post-test situations.
Coefficients ranged from .91 to .98 for overall performance
on the inventory in six different reliability
studies.
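
The review does not say which statistic lies behind these coefficients. Alternate-form reliability is commonly reported as a Pearson correlation between scores on the two forms, and the sketch below assumes that reading. The function name and the scores are hypothetical and are not taken from the User's Manual.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two score lists of equal length (2 or more)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores on one form do not vary")
    return sxy / sqrt(sxx * syy)


# Hypothetical grade-level scores for six students on Form A and Form B.
form_a = [2.0, 3.5, 4.0, 5.5, 6.0, 7.5]
form_b = [2.5, 3.0, 4.5, 5.0, 6.5, 7.0]
print(round(pearson_r(form_a, form_b), 2))
```

A coefficient near 1.0 would indicate that the two forms rank examinees in nearly the same order.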



Field Tryouts:



Ratings:



The User's Manual indicates that the field tryouts
incorporated 410 students, including 124 adult basic
education students from Florida, 69 junior-senior high
school retarded readers from Florida, 111 junior-senior
high school retarded readers from Illinois, 86 junior-senior
high school retarded readers from Georgia, and
20 adult basic education students from a federal prison
in Florida. No sex, ethnic, or racial breakdown was
included.

Part I, Word Recognition and Analysis, see pages 106-107.
Part II, Oral Paragraph Reading, see pages 104-105.
Part III, Present Language Potential, see pages 104-105.
Part IV, Auditory Discrimination, see pages 104-105.



Initial Testing Locator Tests



Publisher:



Description:



Adult Continuing Education Resource Center
Montclair State College
Upper Montclair, New Jersey 07043

The reading test includes three passages of
varying difficulty, each followed by comprehension
questions. It is a preliminary screening
test, designed to help the instructor tentatively
assign students to different instructional levels
or classes within General Educational Development
(GED) programs. This test is given in conjunction
with the Slosson Oral Reading Test.



Availability of
Alternate Forms: There are no alternate forms available.



Administration
Time:



Administration
Procedures:



Although the time required for the test varies 
according to an examinee's performance, it would 
seem the test would require less than 20 minutes. 

The test is individually administered. The examiner 
asks the examinee to read Passage A orally and 
answer the comprehension questions orally. If he 
is unable to do this, the test ends. If he is 
able to do it easily, he is given Passage B and 
asked to read and answer questions in the booklet 
by himself. If he can do this, he is given level C
and asked to read it and respond to questions. After
he has reached his highest level (B or C) he is
given the CTB/McGraw-Hill Test of Adult Basic
Education (TABE), levels M or D, for further
diagnostic testing.



Materials Used : Examiner: Test Booklet, pencil 

Examinee: Test Booklet, pencil, eraser 



Scoring
Procedures:



The examiner compares the examinee's answers with
pre-established correct answers.



Interpretation 
Procedures: 



Validity:



Reliability:



If the student cannot read Passage A, he is probably 
a low-level ABE student. If he can read Passage A
and Passage B, but not Passage C, he is probably 
higher-level ABE or Pre-GED. If he can also read 
Passage C, he is at least low-level GED. In all but 
the first situation, use the TABE level M or D for 
further diagnostic testing. 
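
The interpretation rule above amounts to a small decision table, sketched here in Python. The function name and return format are illustrative assumptions. The source does not say how to classify an examinee who reads Passage A but not Passage B, so the sketch leaves that case unclassified rather than guessing.

```python
def locator_placement(reads_a, reads_b, reads_c):
    """Return (tentative placement, whether TABE follow-up is indicated).

    Each argument is True if the examinee could read that passage.
    """
    if not reads_a:
        # First situation: no TABE follow-up is called for.
        return "low-level ABE", False
    if reads_b and not reads_c:
        return "higher-level ABE or Pre-GED", True
    if reads_b and reads_c:
        return "at least low-level GED", True
    # Reads A but not B: not covered by the stated rule (assumption:
    # leave both the placement and the follow-up undecided).
    return None, None


print(locator_placement(True, True, False))
```

Because the test is a preliminary screen, the placement string is tentative; the TABE level M or D supplies the follow-up diagnosis.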


No information is available on validity. 
No information is available on reliability. 



Field Tryouts: No information is available on field tryouts.



Ratings: 



See pages 104-105. 






Reading Evaluation - Adult Diagnosis (READ) 



Publisher: 



Follett Publishing Company 
1010 West Washington Blvd. 
Chicago, Illinois 60607 

or 

Literacy Volunteers of America, Inc. 
222 West Onondaga Street 
Syracuse, New York 13203 



Description: 



Availability of
Alternate Forms:



Administration 
Time: 



Administration 
Procedures: 



The test has three parts. Part I, Word Recognition,
tests the student's knowledge of sight words.
Part II, Word Analysis, tests the student's decoding
skills. Part III, Reading Inventory, tests the
student's oral reading and comprehension.

Alternate Forms_.a a'lJd 2 are^undeT none H!^er far the 
Reading Inventory (Part III) of the test. 

The three parts of the test do not need to be ad- 
ministered at the same time. Administration times 
for Parts I and II are estimated at five and ten 
minutes respectively; estimated administration time 
for completion of all levels (B-J) of Part III is 
half an hour. 

The test is individually administered. In Parts I 
and II, the examinee reads words and sounds aloud 
while the examiner records errors for each list.
In Part III, the examinee reads stories and answers
questions aloud while the examiner records errors 
for each story. 






Materials Used: 



Examiner: Testing/Record Booklet, pencil. 
Examinee: Reading Lists and passages from test 
booklet. 



Scoring 
Procedures : 



Interpretation 
Procedures : 



Validity : 



Scoring is accomplished through an objective and 
fairly simple process of recording student scores 
for each of the test's three parts on a Summary 
Sheet. Correct scores are converted to percentages 
for Part I (Word Recognition); in Part II, specific 
diagnostic information is recorded on a variety of
reading subskills, such as knowledge of alphabet and
letter sounds. The difficulty of reading and listening
comprehension selections in Part III corresponds
roughly to grade levels, and passing any selection
depends upon not exceeding a specified error count. 
The total passing score is converted to equivalent 
grade level. The test is intended for administration 
on a pre-post basis. 

The Test Summary Sheet provides a detailed reading 
profile for use in planning a specific instructional
program for the examinee. The test booklet also 
provides suggestions for analyzing and using the test 
scores for individualized prescriptive programs. 

Content validity relies on the acceptance of the 
test items by teachers in adult education.






Reliability: No information is available on reliability.



Field Tryouts:
Ratings:



No information is available on field tryouts.

Part I, Word Recognition, see pages 106-107.
Part II, Word Analysis, see pages 106-107.
Part III, Reading Inventory, see pages 104-105.






TEST EVALUATIONS 



The test evaluations on the following pages are divided into the three
major categories used in the Test Review section: criterion-referenced
functional literacy tests, standardized tests, and informal tests. The
standardized tests and informal tests are further categorized under subheadings
indicating the specific behaviors being tested (e.g., oral reading, spelling,
vocabulary).






The behaviors or skills listed under standardized tests are as follows:
o General Educational Development Performance Tests. These tests
  predict examinee performance on the General Educational Development
  Test.
o Multiple Reading Skills Tests. These tests yield results of a
  composite nature (e.g., word meaning and passage comprehension) not
  readily assigned to a single category.
o Reading Comprehension Tests. These tests measure the ability to
  comprehend material read silently.
o Spelling Tests. These tests measure spelling ability.
o Vocabulary Tests. These tests measure knowledge of word meanings.

Behaviors or skills listed under informal tests include the following:
o Oral Reading Tests. These tests measure the ability to read passages
  aloud and to understand what was read, and are sometimes used to
  measure the level of listening comprehension as well.
o Reading Comprehension Tests. These tests measure the ability to
  comprehend material read silently.
o Recognition or Discrimination Tests. These tests measure the ability
  to discriminate between sounds, to pronounce the sounds made by
  letters and blends, or to recognize sight words.
o Vocabulary Tests. These tests measure knowledge of word meanings.

The tables containing ratings for each test appear on facing pages.
The left page includes three major headings: Test Name, Measurement Validity,
and Examinee Appropriateness. The right page also includes three headings:
Technical Excellence, Administrative Usability, and Total Grades. These
headings function as follows:

o Test Name. The test name appears entirely in upper case letters.
  Subtest names are in upper and lower case.
o Measurement Validity, Examinee Appropriateness, Technical Excellence,
  and Administrative Usability. Individual criteria are listed underneath
  each of these headings; following the criteria are the
  possible ranges of points assigned on each criterion. The actual
  entries for each test (or subtest) are listed in the body of the table.
o Total Grades. Grades of good, fair, or poor are assigned to each
  test (or subtest) summarizing the ratings in the four major criterion
  areas (measurement validity, examinee appropriateness, technical
  excellence, and administrative usability).

A detailed discussion of the criteria used to evaluate the tests is
presented in the section Evaluative Criteria.








MEASUREMENT 
VALIDITY 


EXAMINEE APPROPRIATENESS 












TEST NAME








I. CRITERION-REFERENCED
FUNCTIONAL LITERACY TESTS
























^ 1 




ADULT PERFORMANCE LEVEL 


7 


0 


1 


1 


*> 


J 


1 




1 


1 


Ml 




1 


BASIC SKILLS READING MASTERY TEST 


7 


0 


0 


1 


z 


3 


1 


1 


1 


1 

1 — 


Wr


1 — 

' 1 


1 

1 i - 


READING/EVERYDAY ACTIVITIES IN LIFE 


7 


2 


1 


1 


9 
Z 


4 


1 


1 


1 


1 


Wr 


2 


1 


WISCONSIN TEST OF ADULT BASIC EDUCATION,

Life Coping Skills 


6 


0 


0 


1 


2 


2 


1 




1 




_^ 

I 


h- 

Wr 


— 
2 


1 




t 
i 

1 


1 

1 










i 

1 — • — ; , 




II. STANDARDIZED TESTS


^ 


i 




i 






A. General Educational Development
Performance Tests


! 








1 : 






GENERAL EDUCATIONAL PERFORMANCE INDEX, 
Correctness & Effectiveness of Expression 


5 


1 


1 


1 




3 


1 


1 


0 


1 

1 


k 


I 

il 


1 


GENERAL EDUCATIONAL PERFORMANCE INDEX, 
Literary Interpretation 


5 


1 


1 


0 


1 


2 




1 

1 


0 


1 




1 

1 

1 


1 


















— T 1 " 

1 ! 

j ! 




B. Multiple Reading Skills Tests 










I ' 




h- 






i 


i 




SRA READING INDEX 


6 


1 


0 


1 


1 


4 


1 


1 


1 


1 




2 


1 


WISCONSIN TEST OF ADULT BASIC EDUCATION, 
Word Meaning and Reading 


6 


0 


0 


0 




1 


2 


I 1 


1 


— ' 




hr — 
Wr 


2 


1 




! 

1 
1 
1 








— . 




— 1 








1 



























































Note: The body of the table includes the ratings assigned to each test for
individual criteria. A figure of zero on any criterion indicates
non-compliance or lack of information.

The meanings of the symbols under "Response Mode" are as follows:
"Or" = Oral; "Wr" = Written; and "Mi" = Mixed.



100 






TECHNICAL 
EXCELLENCE 



Reliability



5l 



0 3 



0 0 0 

— I — I — 



0-^-0 -2-1 



0 0 2 1 



0 0 2 1 



10 0 1 



0,001 



0 0 11 



0 0 2 1 

— u 



~ri T" 



Administration 



0 o 



If 



?5 



1 3 
3 C 

< 



0-2 



ADMINISTRATIVE USABILITY



Interpretation



1 !i 



0-1 , 0-2 



0-1 



0-1 



c o 



U K O 



0-2 



0-2 



< 



0-1 



0 ' 0 



^ i 0 : 1 : 1 2 1 ! i:i ; 2 : 1 



1 i 0 i 1 1 1 1 j 1 1 2 1 



0 



0 ; 0 



^ i 0 : 0 , 1 i 1 1 i 1 1 : 0 0 I 0 ! 0 

— I — I — . — I — , — ( — — j — j — 



0-1 



0 0 



TOTAL GRADES



1 i2 1 



0 1 



1 I 



i I 



1 .2 1 



1 ! 1 

— \ 



-H 



0 ; 1 

i — 



1 12 1 



1 1 ' 2 ;i 

—1 — I — t— 



1 i2 1 



0 i 0 



-1 



Good-Fair-Poor



FGPP 



FGFF 



FGFF 



FFFP 



FFPF 



FFPF 



FGPF 



FFFP 



Poor for Measurement Validity 
Good for Examinee Appropriateness 
Fair for Technical Excellence 
Fair for Administrative Usability 



101 





MEASUREMENT 
VALIDITY 


EXAMINEE APPROPRIATENESS


TEST NAME


C. Reading Comprehension Tests














1 

i 


i * 




1 

, i 1 


ADULT BASIC LEARNING EXAMINATION, Reading 


5 


1 


1 


1 


1 


4 


] 

1 i 


1 

1 i 1 1 


Wr ; 2 i 1 


BASIC OCCUPATIONAL LITERACY 
TEST , Comprehension 


4 


0 


1 

1 


1 


1 


4 


1 

1 


i 

1 ; 1 1 

1 4 ' 


Wr 2 1 

» i 


TESTS OF ADULT BASIC EDUCATION,
Comprehension 


4 


0 


1 

1 1 1 


1 


4 


1 ' 1 1 1 Wr 1 0 

i_ -i 1 * U— — 








1 ' 






1 ! , ; , ; 




^ 

D. Spelling Tests






— ( ' 

i 

— i 






ADULT BASIC LEARNING EXAMINATION, Spelling


6 i 2 




2 


3 111 1 Wr 2 


1 

! 










; — \ — ' — 1 

„., 1 ; i u — . — , 




E. Vocabulary Tests




i 


i 
1 

\ 


\ i 


i 




ADULT BASIC LEARNING EXAMINATION, 
Vocabulary 


5 ! 1 


— -t- — t 

1 i l ! 2 


All 


1 — \ — ' — r- 




1 


BASIC OCCUPATIONAL LITERACY 
TEST, Vocabulary 


4 ; 0 


1 




4 


i 1 


11 1 Wr I 2 

1 \ ! ^ ^— 


1 


TESTS OF ADULT BASIC EDUCATION, 
Vocabulary 


I 

4 i 0 


1 




4 


r- 
1 


1 1 1 1 Wr ' 1 

1 j > i ' 


0 








1 f' 
i 




J — 


ill 




— — . 
















i — 






T 

1 














1 ' ! 

; f 
^ i 1 


























i 
i 





































Note. The body of the table includes the ratings assigned to each test 

for individual criteria. A figure of zero on any criterion indicates 
non-compliance or lack of information. 



The meanings of the symbols under "Response Mode" are as follows:
"Or" = Oral; "Wr" = Written; and "Mi" = Mixed.



102 






TECHNICAL
EXCELLENCE


ADMINISTRATIVE USABILITY


Reliability


Administration


Interpretation


TOTAL GRADES


Good-Fair-Poor


















k 














1 












0 0 :2 li 

. : , 1 


1 


0 1 

4 . 


f 

i 

i 1 


1 

|2 


1 


1 


1 

|1 


2 


il_. 


0 


0. 


2 


_ 1 _ 


_1_ 


__L 


FGTQ. - - 


" 0 0,01 


1 i 1 

1 — ■ w 


\ 1 


2 


; ! ! ' 

10 1 2 0 


0 


0 

i 


2 


1 2 


1 


1 


PGPG 


0 0 0 0 


_1_0 


1 


1 


2 


1 11 2 1 


. 0 


1 
1 

0 ; 0 


r- 
2 


1 


1 


PGPF 




—4 — . 




— — — T ■ > f 

i ^ i 1 ! 

! : ! i ' 










0 














0 2 1 


1 il 1 12 


1 1 :i 2 1 ; 0 i 0 


r 

2 




1 


1 


1 


FGFG 




! \ ■ — H — 




; ! 














r , ] 
..-.I ( ^ — 1 — 




- -^T- 1 

i 

j 












0 0 11 


1 ! 1 ; 1 12 

: 1 1 1 1 


1 1 1 il i 2 '1 

■A- ~ 1 » L. _ ^ . 


i 

0 ; 0 


2 


2 


1 


1 


FGPG 


0 


0 ;0 ,1 


1 


2 


; J 

1 1 

_1 i 0 11 12 


0 ; 


1 

0 1 0 


2 


2 


1 


1 


PGPG 


0 


0 0 0 




2 




1 ; 1 |i i 2 ; 


1 i 0 0 


0 


2 


1 


1 


?GVG 






( 


^ j 




1 

--f- — 


— \ — ; 


1 


1 

i 












t 

— j — 




1 1 

L_ i 1 


; 1 


! I 
















j 

—.4 , . — ». 1 ■ * ■ 




I 

i 









i 

— i 




! i 


i 














t 




} 
1 








1 






L 






— ^. 


























c 



























The entries under Total Grades summarize test performance on the four major criterion
areas, in this order: 1. Measurement Validity, 2. Examinee Appropriateness,
3. Technical Excellence, and 4. Administrative Usability. Thus, the entry "PGFF"
is to be interpreted:

Poor for Measurement Validity 
Good for Examinee Appropriateness 
Fair for Technical Excellence 
Fair for Administrative Usability 
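
Reading a Total Grades entry is a positional lookup, which the following Python sketch makes explicit. The function and constant names are illustrative assumptions; the letter meanings and the order of the four areas are those stated above.

```python
AREAS = (
    "Measurement Validity",
    "Examinee Appropriateness",
    "Technical Excellence",
    "Administrative Usability",
)
GRADES = {"G": "Good", "F": "Fair", "P": "Poor"}


def decode_total_grades(entry):
    """Expand a four-letter Total Grades entry such as 'PGFF'."""
    if len(entry) != len(AREAS):
        raise ValueError("expected one letter per criterion area")
    lines = []
    for letter, area in zip(entry, AREAS):
        if letter not in GRADES:
            raise ValueError("unrecognized grade letter: " + letter)
        lines.append(GRADES[letter] + " for " + area)
    return lines


for line in decode_total_grades("PGFF"):
    print(line)
```

Entries containing any character other than G, F, or P are rejected rather than guessed at.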

103 





MEASUREMENT 
VALIDITY


EXAMINEE APPROPRIATENESS




TEST NAME 






III. INFORMAL TESTS


























A. Oral Reading Tests














1 

\ [ 


1 


1 

1 

\ 




INFORMAL TEXTBOOK TEST


1 


0 


o| 

■ ■ f 


1 


1 


1 


1 

1 ^ 0 

1 i 


! 

0 ! 1 

— u — 


1 

; 2 


1 


IDAHO STATE PENITENTIARY
INFORMAL READING INVENTORY


3 


0 


0! 


1 


2 


3 


ij 1 


i 

0 


1 


Dr 1 2 

\ 


1 


INDIVIDUAL READING PLACEMENT INVENTORY, 

Oral Paragraph Reading


4 


2 


1 

0 


1 


1 


1 


1 ' \ 


0 


1 


Dr 1 2 


1 


INDIVIDUAL READING PLACEMENT INVENTORY, 
Present Language Potential


4 


2 


0 




1 


1 


1 


1 ' 1 


0 


1 1 


3r 2 


I 


INFORMAL READING INVENTORY, 
Oral Reading


1 

0 0 


0 


1 


1 


1 


0 0 


0 1 Or 2 


1 


INFORMAL READING INVENTORY, 
Present Potential Level


i 

0 , 0 


0 


1 


1 


1 




1 

0 1 pr i 2 

) 1 — +- — 


1 




1 




' 1 


1- 


-0- 


l-;~0- 




HI 1 2- 

\ 


4^ 


INITIAL TESTING LOCATOR TESTS









READING EVALUATION—ADULT DIAGNOSIS, 


1 0 


1 


1 1 


1 


3 


1 


1 


1 


1 


i 

Or I 2 


1 








t" 

1 
i 

\ 


1 










^ 


\ \ 




B. Reading Comprehension Tests


i 

i 




1 












i 

] 


LL 




ADULT BASIC READING INVENTORY, 
Context Reading


0 2 

. " 


1 


1 


■i 


3 


1 


1 


1 


! 1 

-1 


i i 

Wr ' 2 

■i i 






1 

) 
















1 








C. Recognition or Discrimination Tests




























ADULT BASIC READING INVENTORY, 
Sight Words


0 


0 


1 


1 


1 


4 


1 


1 


1 


1 


Wr 


2 


1 


ADULT BASIC READING INVENTORY, 
Sound and Letter Discrimination 


0 


0 


1 


1 


1 


4 


1 


1 


1 


1 


Wr 


2 


1 



Note: The body of the table includes the ratings assigned to each test for
individual criteria. A figure of zero on any criterion indicates
non-compliance or lack of information.

The meanings of the symbols under "Response Mode" are as follows:
"Or" = Oral; "Wr" = Written; and "Mi" = Mixed.



104 






TECHNICAL
EXCELLENCE


ADMINISTRATIVE USABILITY


Administration


Interpretation


TOTAL GRADES


Good-Fair-Poor


















































i 




i 












— • 












0 ' 0 ! 0 io 
.. ■ i ■■ — — — 


: ; i i i 1 

11 0 . 1 1 ,1 i 1 


0 


T 

2 ' 1 


0 


0 


0 


1 


0 


0 


PFPP 


0 0 0 :o 


— ; : 1 1 ' 

1 i 0 1 1 1 1 1 J 1 


i — 1 

0 


r - — ■ 


1- — ^ 

0 

i» 1 


0 


0 


2 


1 


1 


PGPF 


3 0 0 ^1 ' 


,a il l 11 ijii 


2 1 


0 


, 

0 


— - 

0 


2 


1 


1 


FFFF 


3 0 0 1 


Nil , ; ] 

^'i li i i ii:ii,2!i ;o 

r ^ » . ^ « 1 i 4 4 1 


0 


1 — ' 


2 




1 


FFFF 


0 0 0 0 


1 1 0 1 0 1 1 1 1 l/'! 0 1 0 

— — 1 > 4 — . . . — : — J 1 


0 

, — 1 




1 1 0 


0 • 


PPPP 


0 0 0 0 


I'O 011111 2 1 0 


0 


0 


1 






PFPP 


!- • 1 






1- — -J 

0 0 1 ; 0 


0 


0 i 1 




PPPP , 


0 0 0 0 




10 1 ,1 1 : 1 


0 


0 


0 0 0 0 


1 

, 


0 : 0 1 :1 1 


l|l 1 2 ,1 


0 




0 


H 

0 I 1 


.1 


1 


PGPF 


— ^ ♦ 












i 

i ; 


























1 ! ; i . 














0 0 2 1 


1 


1 0 






1 i 1 1 i 2 ! 1 I 0 


n 

u 


0 


2 


0 


0 


PGFF 








1 


i 




1 — 














_ . J — 

t 




1 
j 




0 




! J 



















0 .0 12 


1 


1 
1 


' j 

1 i 0 ! 1 

: 1 


2 


1 


1 


2 


1 


0 


0 


0 


0 


0 


0 


PGFP 


0 


i ' 

0 |2 


1 


1 


0 


1 


2 


1 


0 


1 


2 


0 


0 


0 


0 


1 


0 


0 . 


PGFP 



The entries under Total Grades summarize test performance on the four major criterion
areas, in this order: 1. Measurement Validity, 2. Examinee Appropriateness,
3. Technical Excellence, and 4. Administrative Usability. Thus, the entry "PGFF"
is to be interpreted:

Poor for Measurement Validity
Good for Examinee Appropriateness
Fair for Technical Excellence
Fair for Administrative Usability

105



TEST NAME


MEASUREMENT
VALIDITY


EXAMINEE APPROPRIATENESS




C. Recognition or Discrimination Tests
(Continued)




























CYZYK PRE-READING INVENTORY 


0 


0 


0 


1 


1 


3 


0 


0 




Wr 


2 




HARRIS GRADED WORD TEST 


0 


0 


0 


1 

' 


1 


1 


0 


0 


Ojl 


Or I 2 




INDIVIDUAL READING PLACEMENT
INVENTORY, Auditory Discrimination 


4 


2 


0 


1 


2 


1 




1 


i 

0 1 1 


Dr • 2 




INDIVIDUAL READING PLACEMENT
INVENTORY, Word Recognition 


4 


2 


0 


1 


1 


1 


1 i 1 




Or I 2 




INFORMAL READING INVENTORY, Visual 

and A^uditory Perception and Discrimination 


0 


0 


o' 1 


2 


1 


1 0 0 ! 1 


■ 

Dr 2 




INFORMAL READING INVENTORY, 
Word Recognition and Analysis 


i 

0 0 




1 


ili^olo 1 


Or 


<2 




READING EVALUATION--ADULT DIAGNOSIS, 
Word Analysis 


2 0 


1 

0 ! 1 


2 


1 1 1 : 1 ! 0 1 br 


2 




READING EVALUATION—ADULT DIAGNOSIS, 
Word Recognition 


2 ' 0 


1 

0 ' 1 


1 

|2 


i 

1 j 1 


1 


0 


, 1 br 


1 

2 












1 








i 1 








Z>. Vocabularnj Teste 






i 










i : 
' 1 

\ \ 


1 

1 




ADULT BASIC READING INVENTORY. 
Word Meanine (Listening) 


0 i 2 


1 


k 


j 1 


4 


1 




1 ' 1 Wr 


2 


1 


ADULT BASIC READING INVENTORY, 
Word Meaning (Reading) 


1 2 


1 


! 

1 


,1 


4 


1 


1 


1 ! 1 Wr 1 2 


1 


















. .1 . . . 


i 


1 























j 




! 





























































Note: The body of the table includes the ratings assigned to each test for
individual criteria. A figure of zero on any criterion indicates
non-compliance or lack of information.

The meanings of the symbols under "Response Mode" are as follows:
"Or" = Oral; "Wr" = Written; and "Mi" = Mixed.






Table (continued): Ratings of tests on the Technical Excellence (Reliability)
and Administrative Usability (Administration, Interpretation) criteria, with
Total Grades (Good-Fair-Poor). The individual column headings and cell ratings
are not legible in this copy; the Total Grades entries follow in row order.

TEST NAME                                                    TOTAL GRADES

C. Recognition or Discrimination Tests (Continued)

CYZYK PRE-READING INVENTORY                                      PFPP
HARRIS GRADED WORD TEST                                          PPPP
INDIVIDUAL READING PLACEMENT INVENTORY,
  Auditory Discrimination                                        FFFF
INDIVIDUAL READING PLACEMENT INVENTORY,
  Word Recognition                                               FFFF
INFORMAL READING INVENTORY, Visual and
  Auditory Perception and Discrimination                         PFPP
INFORMAL READING INVENTORY,
  Word Recognition and Analysis                                  PFPP
READING EVALUATION--ADULT DIAGNOSIS,
  Word Analysis                                                  PFPF
READING EVALUATION--ADULT DIAGNOSIS,
  Word Recognition                                               PFPF

D. Vocabulary Tests

ADULT BASIC READING INVENTORY,
  Word Meaning (Listening)                                       PGFF
ADULT BASIC READING INVENTORY,
  Word Meaning (Reading)                                         PGFF

The entries under Total Grades summarize test performance on the four major criterion
areas, in this order: 1. Measurement Validity, 2. Examinee Appropriateness,
3. Technical Excellence, and 4. Administrative Usability. Thus, the entry "PGFF"
is to be interpreted:

     Poor for Measurement Validity
     Good for Examinee Appropriateness
     Fair for Technical Excellence
     Fair for Administrative Usability



SUMMARY 

In summarizing the results of these evaluations, it is useful to
examine the different groups of tests (criterion-referenced tests of
functional literacy, standardized tests, and informal tests) for strengths
and weaknesses. Examining these specific groups of tests reveals several
trends.

Criterion-referenced tests of functional literacy are the newest type
of tests on the adult literacy testing market. Their availability is a result
of the recent interest in teaching and measuring functional skills for adults.
The newness of these tests is reflected by the fact that only one of the four
tests has been made available for general dissemination by a commercial
publisher, although some of the others can be obtained from the authors.

The primary strength of the criterion-referenced tests lay in their
appropriateness for the examinees. In general, these tests demonstrated
sensitivity to the testing requirements of their target group. However,
these tests were generally not rated highly otherwise; most of their ratings
in the other three areas were "Fair." Particular points to which the test
developers have not attended are establishing concurrent or predictive
relationships, developing and testing alternate forms (including alternate-form
reliability), and determining test-retest reliability. Doubtless, failure to
attend to these points stems from the problem of obtaining adequate time and
money to accomplish all important tasks. For tests in the early stages of
development, considerations of time and money would probably prevail regardless
of the proclivities of test developers. Nevertheless, the need for adequate
data remains, and should become the focus of subsequent efforts.






The standardized tests evaluated were generally accompanied by the 
most complete information concerning their development and use. The major 
strengths of these tests lay in their high appropriateness for examinees, 
and, to a lesser extent, their administrative usability. Also, they were 
often accompanied by extensive data describing results from field tryouts 
and other studies. On the negative side, much of the data presented did 
not strongly support the measurement validity or technical excellence of the 
tests. The psychometric quality of some tests clearly called for improvement. 

Informal tests were found to vary the most in quality, and were the
weakest overall. They presented a definite advantage in that they could be
easily and quickly administered in a low-threat environment. However, they
entailed many problems. Most lacked adequate directions for administration,
scoring, and interpretation; many included no description of design or
development procedures. The inadequacy of this information was evidenced by
the many "Poor" ratings the tests received. However, some informal tests had
undergone substantially more testing than the others and therefore stood in
contrast to the others in terms of quality.

Even though informal tests are typically used for a fairly limited
purpose (the initial placement of students or the diagnosis of specific reading
problems), their psychometric quality should not be ignored. In fact, the
diagnosis of reading problems is so important that it ought to be done with
thoroughly tested instruments. Omission of essential information limits the
utility of any test and opens its results to question.

Continuing Development of Functional Literacy Measures 

In addition to the tests listed earlier in this report, three other
developmental efforts represent the continuing interest in developing measures
of functional literacy. These efforts were not noted elsewhere in this report
because no tests associated with these efforts are available. They are
briefly noted here to provide information about potential sources of
literacy measures.

Work is ongoing at the National Assessment of Educational
Progress (NAEP). NAEP provides continual, direct assessment of educational
outcomes nationwide in several learning areas for four age groups, including
young adults.²⁶ Although NAEP does not publish its tests, it does periodically
release some test items with attendant item data. The measurement of reading
is organized around nine themes: (1) words and word relationships, (2) visual
aids, (3) written directions, (4) reference materials, (5) significant facts,
(6) main ideas and organization, (7) inference, (8) critical reading, and
(9) reading rate. Some of these themes are obviously more closely related to
measuring functional skills than others. The items for each theme, as well
as available data, can be found in several NAEP publications.²⁷

Work conducted by the Human Resources Research Organization (HumRRO) on
the measurement of work-related literacy for military occupations provides a
second source of information on measuring literacy.²⁸ The primary value of
the HumRRO work lies not so much in the tests themselves (since, even if
available, they would be applicable only to military specialties) but rather
in the comprehensive methodologies that HumRRO has established for determining
the reading requirements of occupations. It would be particularly productive
to apply their methodologies to determine what literacy skills are needed to
function adequately in typical daily tasks and selected occupations or groups
of occupations. Only by first determining such skills can educators of adults
provide training in functional literacy.

A third effort involves the Adult Functional Reading Study conducted by
the Educational Testing Service (ETS).²⁹ Initiated in 1970, the study began
with a national survey to determine typical tasks of adults, followed by
construction of a measurement instrument for determining the ability of adults
in performing these tasks and a national survey assessing the attainment of
skills for such tasks. More recently, project staff examined the relationship
of performance on functional reading tasks to decoding skills and performance
on cloze tests, and developed an experimental test for assessing reading
competency in schools.³⁰ The results of these studies are available,
although to date no tests of functional literacy from the project have
become accessible.

Conclusion 

The reviews and evaluations in this report indicate that adult literacy
testing is still a developing field marked by broad variety in the quality
of available instruments. And despite the recent emphasis on reducing adult
illiteracy in the United States, very few instruments have been developed
and tested specifically for use with adults.

Much recent work in test development has concentrated on identifying
important functional skills and constructing instruments to measure these
skills. Further test development using the criteria suggested in this
report can help make these tests highly appropriate for use with adult
students. While much has been done, test users and developers must continue
to combine their competence and efforts to produce instruments responsive
to the testing needs of adults.






FOOTNOTES 



1 Request for proposal to collect and evaluate tests of functional adult
literacy (Office of Planning, Budget, and Evaluation, U. S. Office of
Education, September 1974).

2 Winthrop R. Adkins, "Life Skills Education for the Adult Learner,"
Adult Leadership, 22, No. 2 (1973), 55-58, 82-84. Also Louis Harris et al.,
"Survival Literacy Study" (New York, 1970), ED 068 813.

3 Such sources concerned with adult basic education tests include Kathleen
Vonderhaar, Donald W. Mocker, Robert E. Leibert, and Vera Maass, Tests for
Adult Basic Education Teachers: "28 Suggestions for Classroom Teachers"
(Kansas City, Missouri: Center for Resource Development in Adult Education,
University of Missouri-Kansas City, 1975); and Joan Fischer, Jane F. Flaherty,
and Robert H. Arents, Testing Guidelines for Adult Basic Education and High
School Equivalency Programs (Trenton, New Jersey: The Office of Adult Basic
Education, Department of Education, 1973). For a review of all tests in print,
see Oscar Krisen Buros, Tests in Print II: An Index to Tests, Test Reviews,
and the Literature on Specific Tests (Highland Park, New Jersey: The Gryphon
Press, 1974).

4 Statistical Abstract of the United States (Washington, D.C.: U. S.
Government Printing Office, 1974).

5 Literacy Among Youths 12-17 Years, United States (DHEW Publication
#(HRA) 74-1613, Baltimore, Maryland: National Center for Health Statistics,
1973).

6 Louis Harris, et al., "Survival Literacy Study."

7 Alex M. Caughran and John A. Lindlof, "Should the 'Survival Literacy
Study' Survive?" Journal of Reading, 15, No. 6 (1972), 429-435.

8 Louis Harris, et al., "The 1971 Reading Difficulty Index: A Study of
Functional Reading Ability in the U.S." (New York, 1971), ED 057 312.

9 Norvell Northcutt, Charles Kelso, and W. E. Barron, "Adult Functional
Competency in Texas" (Austin, Texas: University of Texas Press, 1975).

10 John R. Bormuth, "Reading Literacy: Its Definition and Assessment,"
Reading Research Quarterly, 9, No. 1 (1974), 7-66.

11 John R. Bormuth, "Development of Readability Analyses," Report No. 7-0052
(Chicago: University of Chicago Press, 1969).

12 T. G. R. Bower, "Reading by Eye," Basic Studies in Reading, Ed. H. Levin
and J. P. Williams (New York: Basic Books, 1970), pp. 134-146.




13 K. S. Goodman, "Analysis of Oral Reading Miscues: Applied
Psycholinguistics," Reading Research Quarterly, 5, No. 1 (1969), 119-130.

14 Eleanor J. Gibson, "Learning to Read," Theoretical Models and Processes
of Reading, Ed. Harry Singer and Robert B. Ruddell (Newark, Delaware:
International Reading Association, 1970), pp. 315-334.

15 Bormuth, 1974, pp. 7-66.

16 William S. Gray, The Teaching of Reading and Writing (Switzerland:
United Nations Educational, Scientific, and Cultural Organization, 1969).

17 Request for proposal.

18 Request for proposal.

19 Thomas G. Sticht, Ed., Reading for Working: A Functional Literacy
Anthology (Alexandria, Virginia: Human Resources Research Organization, 1975).

20 Ronald P. Carver, "Reading as Reasoning: Implications for Measurement,"
Assessment Problems in Reading, Ed. Walter H. MacGinitie (Newark, Delaware:
International Reading Association, 1973), pp. 44-56.

21 Carver, pp. 44-56.

22 Bormuth, 1974, pp. 7-66.

23 Walter H. MacGinitie, "An Introduction to Some Measurement Problems
in Reading," Assessment Problems in Reading, Ed. Walter H. MacGinitie (Newark,
Delaware: International Reading Association, 1973), pp. 1-7.

24 Ralph Hoepfner, et al., CSE Secondary School Test Evaluations (Los
Angeles: Center for the Study of Evaluation, Graduate School of Education,
University of California, 1973); and Ralph Hoepfner, et al., CSE-RBS Test
Evaluations: Tests of Higher-Order Cognitive, Affective, and Interpersonal
Skills (Los Angeles: Center for the Study of Evaluation, Graduate School of
Education, University of California, 1972).

25 Wayne Otto, "Evaluation Instruments for Assessing Needs and Growth
in Reading," Assessment Problems in Reading, Ed. Walter H. MacGinitie
(Newark, Delaware: International Reading Association, 1973), pp. 14-20.







26 National Assessment of Educational Progress: General Information Yearbook
(Report #03/04-GIY, Washington, D.C.: U. S. Government Printing Office, 1974).

27 National Assessment of Educational Progress, Reading and Literature:
General Information Yearbook (Report #02-GIY, Washington, D.C.: U. S.
Government Printing Office, 1972).

28 John S. Caylor, Thomas G. Sticht, Lynn C. Fox, and J. Patrick Ford,
Methodologies for Determining Reading Requirements of Military Occupational
Specialties (Report # HumRRO-TR-73-5, Alexandria, Virginia: Human Resources
Research Organization, 1973). Also Sticht, 1975.

29 Richard T. Murphy, Final Report: Adult Functional Reading Study (Grant
#OEC-0-70-4791(508), National Institute of Education, U. S. Department of Health,
Education, and Welfare, 1973).

30 Richard T. Murphy, Supplement to Final Report: Adult Functional Reading
Study (Grant #OEC-0-70-4791(508), National Institute of Education, U. S.
Department of Health, Education, and Welfare, 1975).






BIBLIOGRAPHY 



Adkins, Winthrop R. "Life Skills Education for the Adult Learner." Adult
     Leadership, 22, No. 2 (1973), 55-58, 82-84.

Bormuth, John R. "Development of Readability Analyses." Report No. 7-0052.
     Chicago, Illinois: University of Chicago Press, 1969.

------. "Development of Standards of Readability." Project No. 9-0237.
     Chicago, Illinois: University of Chicago Press, 1971.

------. "Reading Literacy: Its Definition and Assessment." Reading Research
     Quarterly, 9, No. 1 (1974), 7-66.

Bower, T. G. R. "Reading by Eye." Basic Studies in Reading. H. Levin and
     J. P. Williams, Eds. New York: Basic Books, 1970.

Buros, Oscar Krisen. Tests in Print II: An Index to Tests, Test Reviews, and
     the Literature on Specific Tests. Highland Park, New Jersey: The Gryphon
     Press, 1974.

Carver, Ronald P. "Reading as Reasoning: Implications for Measurement."
     Assessment Problems in Reading. Walter H. MacGinitie, Ed. Newark,
     Delaware: International Reading Association, 1973.

Caughran, Alex M. and John A. Lindlof. "Should the 'Survival Literacy Study'
     Survive?" Journal of Reading, 15, No. 6 (1972), 429-435.

Caylor, John S., Thomas G. Sticht, Lynn C. Fox, and J. Patrick Ford. Methodologies
     for Determining Reading Requirements of Military Occupational Specialties.
     Report # HumRRO-TR-73-5, Alexandria, Virginia: Human Resources Research
     Organization, 1973.

Fischer, Joan, Jane F. Flaherty, and Robert H. Arents. Testing Guidelines
     for Adult Basic Education and High School Equivalency Programs. Trenton,
     New Jersey: The Office of Adult Basic Education, Department of Education,
     1973.

Gibson, Eleanor J. "Learning to Read." Theoretical Models and Processes of
     Reading. Harry Singer and Robert B. Ruddell, Eds. Newark, Delaware:
     International Reading Association, 1970.

Goodman, K. S. "Analysis of Oral Reading Miscues: Applied Psycholinguistics."
     Reading Research Quarterly, 5, No. 1 (1969), 119-130.

Gray, William S. The Teaching of Reading and Writing. Switzerland: United
     Nations Educational, Scientific and Cultural Organization, 1969.

Harris, Louis, et al. "Survival Literacy Study." New York, 1970. ED 069 813.




------. "The 1971 Reading Difficulty Index: A Study of Functional Reading
     Ability in the U. S." New York, 1971. ED 057 312.

Hoepfner, Ralph, et al. CSE Secondary School Test Evaluations. Los Angeles:
     Center for the Study of Evaluation, Graduate School of Education,
     University of California, 1973.

------. CSE-RBS Test Evaluations: Tests of Higher-Order Cognitive, Affective,
     and Interpersonal Skills. Los Angeles: Center for the Study of
     Evaluation, Graduate School of Education, University of California, 1972.

Literacy Among Youths 12-17 Years, United States. DHEW Publication #(HRA) 74-1613,
     Baltimore, Maryland: National Center for Health Statistics, 1973.

MacGinitie, Walter H. "An Introduction to Some Measurement Problems in Reading."
     Assessment Problems in Reading. Walter H. MacGinitie, Ed. Newark,
     Delaware: International Reading Association, 1973.

Murphy, Richard T. Final Report: Adult Functional Reading Study. Grant #
     OEC-0-70-4791(508), National Institute of Education, U. S. Department of
     Health, Education and Welfare, 1973.

------. Supplement to Final Report: Adult Functional Reading Study. Grant #
     OEC-0-70-4791(508), National Institute of Education, U. S. Department of
     Health, Education, and Welfare, 1975.

National Assessment of Educational Progress: General Information Yearbook.
     Report #03/04-GIY, Washington, D.C.: U. S. Government Printing Office,
     1974.

National Assessment of Educational Progress, Reading and Literature: General
     Information Yearbook. Report #02-GIY, Washington, D.C.: U. S. Government
     Printing Office, 1972.

Northcutt, Norvell, Charles Kelso, and W. E. Barron. "Adult Functional
     Competency in Texas." Austin, Texas: University of Texas Press, 1975.

Otto, Wayne. "Evaluation Instruments for Assessing Needs and Growth in
     Reading." Assessment Problems in Reading. Walter H. MacGinitie, Ed.
     Newark, Delaware: International Reading Association, 1973.

Request for proposal to collect and evaluate tests of functional adult literacy.
     Office of Planning, Budget and Evaluation, U. S. Office of Education,
     September 1974.

Statistical Abstract of the United States. Washington, D.C.: U. S. Government
     Printing Office, 1974.

Sticht, Thomas G., et al. Auding and Reading: A Developmental Model.
     Alexandria, Virginia: Human Resources Research Organization, 1974.



------. Reading for Working: A Functional Literacy Anthology. Alexandria,
     Virginia: Human Resources Research Organization, 1975.

Vonderhaar, Kathleen, Donald W. Mocker, Robert E. Leibert, and Vera Maass.
     Tests for Adult Basic Education Teachers: "28 Suggestions for
     Classroom Teachers." Kansas City, Missouri: Center for Resource
     Development in Adult Education, University of Missouri-Kansas City, 1975.






